1 About This Data Sheet



This data sheet provides a technical overview of the DIGITAL Alpha 21164
microprocessor (called the 21164), including:

• Functional units

• Signal descriptions

• External interface

• Internal processor register (IPR) summary

• Privileged architecture library code (PALcode) instructions

• Electrical characteristics

• Thermal characteristics

• Mechanical packaging

This data sheet is not intended to provide the reader with everything needed to begin
chip implementation. For a more comprehensive description of the 21164 and the
Alpha architecture, refer to documents listed in the Products and Documentation
section located at the end of this document.

Document Conventions 

Throughout this data sheet, the following conventions are used: 

• INTn refers to naturally aligned groups of n 8-bit bytes. For example:

- INT16: The four least significant address bits are 0.

- INT8: The three least significant address bits are 0.

- INT4: The two least significant address bits are 0.

• Values of 1, 0, and X are used in some tables. The X signifies a don't care (1 or 
0) convention, which can be determined by the system designer. 
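
The INTn convention can be checked with a simple bit mask. The following is a minimal sketch, not part of the data sheet; the function name `is_naturally_aligned` is an illustrative assumption.

```python
def is_naturally_aligned(address: int, n_bytes: int) -> bool:
    """Return True if address lies on an n_bytes boundary (INTn alignment).

    n_bytes must be a power of two, for example 4, 8, or 16.
    """
    if n_bytes <= 0 or n_bytes & (n_bytes - 1):
        raise ValueError("n_bytes must be a positive power of two")
    # Aligned means the low log2(n_bytes) address bits are all 0.
    return address & (n_bytes - 1) == 0


print(is_naturally_aligned(0x1230, 16))  # INT16: low four bits are 0
print(is_naturally_aligned(0x1238, 16))  # bit 3 is set, so not INT16
print(is_naturally_aligned(0x1238, 8))   # INT8: low three bits are 0
```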

2 Alpha 21164 Microprocessor Features

The 21164 has the following features: 

• Fully pipelined 64-bit advanced RISC architecture supports multiple operating 
systems, including: 

- Microsoft Windows NT 

- DIGITAL UNIX 

- OpenVMS 

• 366-MHz through 600-MHz operation

• Superscalar 4-way instruction issue

• High-bandwidth (128-bit) interface

• Peak execution rate of 1.7 BIPS

• 0.35-µm CMOS technology

• Three onchip caches:

- 8KB, direct-mapped, L1 instruction cache

- 8KB, dual-ported, direct-mapped, write-through L1 data cache

- 96KB, 3-way, set-associative, write-back L2 data and instruction cache

• Supports optional board-level L3 cache ranging from 1MB to 64MB 

• Supports byte and word data types 

• 3.3-V external interface and 2.5-V internal interface 

The 21164 implements IEEE S_floating and T_floating, and VAX F_floating and
G_floating data types, and supports longword (32-bit) and quadword (64-bit)
integers. It also provides byte (8-bit) and word (16-bit) support by byte-manipulation
instructions. Limited hardware support is provided for the VAX D_floating data
type.

3 Microarchitecture



The 21164 microprocessor is a high-performance implementation of DIGITAL's
Alpha architecture. The following sections provide an overview of the chip's
architecture and major functional units.

Figure 1 is a block diagram of the 21164. 

The 21164 consists of the following sections (see Figure 1): 

• Instruction fetch/decode and branch unit (IDU) 

• Integer execution unit (IEU)

• Memory address translation unit (MTU) 

• Cache control and bus interface unit (CBU) 

• Floating-point execution unit (FPU) 

• Data cache (Dcache) 

• Instruction cache (Icache) 

• Secondary cache (Scache) 

• Serial read-only memory (SROM) interface 

Figure 1 21164 Microprocessor Block/Pipe Flow Diagram



3.1 Instruction Fetch/Decode and Branch Unit 

The primary function of the instruction fetch/decode and branch unit (IDU) is to
manage and issue instructions to the IEU, MTU, and FPU. It also manages the
instruction cache. The IDU contains:

• Prefetcher and instruction buffer 

• Instruction slot and issue logic 

• Program counter (PC) and branch prediction logic 

• 48-entry instruction translation buffer (ITB)

• Abort logic 

• Register conflict logic 

• Interrupt and exception logic 

3.1.1 Instruction Prefetch and Decode 

The IDU handles only naturally aligned groups of four instructions
(INT16). The IDU does not advance to a new group of four instructions until all
instructions in a group are issued. If a branch to the middle of an INT16 group
occurs, then the IDU attempts to issue the instructions from the branch target to the
end of the current INT16; the IDU then proceeds to the next INT16 of instructions
after all the instructions in the target INT16 are issued. Thus, proper code scheduling
is required to achieve optimal performance.
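
The scheduling cost of branching into the middle of an INT16 group can be estimated with a little arithmetic. This is a rough sketch that assumes 4-byte Alpha instructions; the function name is illustrative and the model ignores all other issue constraints.

```python
INSTRUCTION_BYTES = 4   # Alpha instructions are 32 bits wide
GROUP_BYTES = 16        # one INT16 group holds four instructions


def issuable_in_target_group(target_pc: int) -> int:
    """Count the instructions from a branch target to the end of its INT16 group."""
    if target_pc % INSTRUCTION_BYTES:
        raise ValueError("target_pc must be instruction aligned")
    offset_in_group = target_pc % GROUP_BYTES
    return (GROUP_BYTES - offset_in_group) // INSTRUCTION_BYTES


for pc in (0x1000, 0x1004, 0x1008, 0x100C):
    print(hex(pc), issuable_in_target_group(pc))
```

A target at the start of a group leaves all four slots available, while a target in the last slot leaves only one, which is why aligning frequently taken branch targets to INT16 boundaries tends to help.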

3.1.2 Branch Prediction 

The branch unit, or prediction logic, is also part of the IDU. Branch and PC
prediction are necessary to predict and begin fetching the target instruction stream before
the branch or jump instruction is issued. Each instruction location in the instruction
cache (Icache) contains a 2-bit history state to record the outcome of branch
instructions.
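
This data sheet does not say how the 2-bit history state is updated. One common scheme for 2-bit branch history is a saturating counter, sketched below purely as an illustration; the class name, the state encoding, and the update rule are assumptions, not a description of the 21164 logic.

```python
class TwoBitHistory:
    """Generic 2-bit saturating counter (assumed scheme, not confirmed for the 21164)."""

    def __init__(self, state: int = 0) -> None:
        if state not in range(4):
            raise ValueError("state must be 0..3")
        self.state = state  # 0,1 predict not taken; 2,3 predict taken

    def predict_taken(self) -> bool:
        return self.state >= 2

    def update(self, taken: bool) -> None:
        # Move one step toward the observed outcome, saturating at 0 and 3.
        if taken:
            self.state = min(3, self.state + 1)
        else:
            self.state = max(0, self.state - 1)


history = TwoBitHistory()
for outcome in (True, True, False, True):
    print(history.predict_taken(), outcome)
    history.update(outcome)
```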

3.1.3 Instruction Translation Buffer 

The IDU includes a 48-entry, fully associative instruction translation buffer (ITB).
The buffer stores recently used instruction stream (Istream) address translations and
protection information for pages ranging from 8KB to 512KB and uses a
not-last-used replacement algorithm.

The 21164 provides two optional translation extensions called superpages. Access to
superpages is allowed only while executing in privileged mode.

• One superpage maps virtual address bits <39:13> to physical address bits
<39:13>, on a one-to-one basis, when virtual address bits <42:41> equal 2.

• The other superpage maps virtual address bits <29:13> to physical address bits
<29:13>, on a one-to-one basis, and forces physical address bits <39:30> to 0
when virtual address bits <42:30> equal 1FFE (hex).
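
The two superpage mappings can be expressed as bit-field checks. The sketch below is illustrative only: the function name is hypothetical, it assumes the page offset bits <12:0> pass through unchanged, it assumes the second superpage forces physical address bits <39:30> to 0, and it does not model the privileged-mode check or the enable bits.

```python
VA_MASK = (1 << 43) - 1  # 43-bit virtual address


def superpage_translate(va: int):
    """Return the physical address for a superpage hit, or None otherwise."""
    va &= VA_MASK
    # Superpage 1: VA<42:41> == 2 maps VA<39:13> straight to PA<39:13>.
    if (va >> 41) & 0x3 == 2:
        return va & ((1 << 40) - 1)
    # Superpage 2: VA<42:30> == 0x1FFE maps VA<29:13> to PA<29:13>,
    # with PA<39:30> assumed to be forced to 0.
    if (va >> 30) & 0x1FFF == 0x1FFE:
        return va & ((1 << 30) - 1)
    return None


print(hex(superpage_translate((2 << 41) | 0x12345678)))
print(hex(superpage_translate((0x1FFE << 30) | 0x02ABCDEF)))
print(superpage_translate(0x1000))
```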

3.1.4 Interrupts 

The IDU exception logic supports three sources of interrupts: 

• Hardware interrupts 

There are seven level-sensitive hardware interrupt sources supplied by the 
following signals: 

irq_h<3:0> 
sys_mch_chk_irq_h 
pwr_fail_irq_h 
mch_hlt_irq_h 

• Software interrupts 

There are 15 prioritized software interrupts sourced by an onchip internal
processor register (IPR).

• Asynchronous system traps 

There are four asynchronous system traps (ASTs) controlled by onchip IPRs. 

Most interrupts can be independently masked in onchip enable registers. In
addition, AST interrupts are qualified by the current processor mode. All interrupts
are disabled when the processor is executing PALcode.

3.2 Integer Execution Unit 

The integer execution unit (IEU) contains two 64-bit integer execution pipelines,
E0 and E1, which include the following:

• Two adders 

• Two logic boxes 

• A barrel shifter 

• Byte-manipulation logic 

• An integer multiplier 

The IEU also includes the 40-entry, 64-bit integer register file (IRF) that contains the
32 integer registers defined by the Alpha architecture and 8 PALshadow registers. 
The register file has four read ports and two write ports, which provide operands to 
both integer execution pipelines and accept results from both pipes. The register file 
also accepts load instruction results (memory data) on the same two write ports. 

3.3 Floating-Point Execution Unit 

The onchip, pipelined floating-point unit (FPU) can execute both IEEE and VAX 
floating-point instructions. The 21164 supports IEEE S_floating and T_floating data 
types, and all rounding modes. It also supports VAX F_floating and G_floating data 
types, and provides limited support for the D_floating format. The FPU contains: 

• A 32-entry, 64-bit floating-point register file (FRF). 

• A user-accessible control register. 

• A floating-point multiply pipeline. 

• A floating-point add pipeline — The floating-point divide unit is associated with 
the floating-point add pipeline but is not pipelined. 

The FPU can accept two instructions every cycle, with the exception of
floating-point divide instructions. The result latency for nondivide, floating-point instructions
is four cycles.

3.4 Memory Address Translation Unit 

The memory address translation unit (MTU) contains three major sections: 

• Data translation buffer (dual ported) 

• Miss address file (MAF) 

• Write buffer address file 

The MTU receives up to two virtual addresses every cycle from the IEU. The
translation buffer generates the corresponding physical addresses and access control
information for each virtual address. The 21164 implements a 43-bit virtual address
and a 40-bit physical address.
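
With the 8KB base page size used elsewhere in this data sheet, the address widths above imply a 13-bit page offset, a 30-bit virtual page number, and a 27-bit physical page frame number. The helper names in this sketch are illustrative assumptions.

```python
VA_BITS = 43
PA_BITS = 40
PAGE_OFFSET_BITS = 13  # 8KB pages: 2**13 bytes


def split_virtual_address(va: int):
    """Split a 43-bit virtual address into (virtual page number, page offset)."""
    if not 0 <= va < (1 << VA_BITS):
        raise ValueError("virtual address out of range")
    return va >> PAGE_OFFSET_BITS, va & ((1 << PAGE_OFFSET_BITS) - 1)


def join_physical_address(pfn: int, offset: int) -> int:
    """Combine a page frame number and a page offset into a 40-bit physical address."""
    if not 0 <= pfn < (1 << (PA_BITS - PAGE_OFFSET_BITS)):
        raise ValueError("page frame number out of range")
    if not 0 <= offset < (1 << PAGE_OFFSET_BITS):
        raise ValueError("page offset out of range")
    return (pfn << PAGE_OFFSET_BITS) | offset


vpn, offset = split_virtual_address(0x2345)
print(vpn, hex(offset))
print(hex(join_physical_address(5, 0x10)))
```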

The MTU can perform read and write operations to/from memory on byte, word, 
longword, and quadword boundaries. 

3.4.1 Data Translation Buffer 

The 64-entry, fully associative, dual-read-ported data translation buffer (DTB) stores 
recently used data stream (Dstream) page table entries (PTEs). Each entry supports 
all four granularity hint-bit combinations, so that a single DTB entry can provide 
translation for up to 512 contiguously mapped, 8KB pages. 
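
The figure of 512 pages follows from the Alpha architecture's granularity hint (GH) encoding, in which a GH value of N marks a block of 8^N contiguous pages. The sketch below works through the four cases; the function name is an illustrative assumption.

```python
PAGE_BYTES = 8 * 1024  # 8KB base page


def granularity_hint_region_bytes(gh: int) -> int:
    """Return the size of the region one translation buffer entry maps for a GH value."""
    if gh not in range(4):
        raise ValueError("granularity hint must be 0..3")
    pages = 8 ** gh  # 1, 8, 64, or 512 contiguous pages
    return pages * PAGE_BYTES


for gh in range(4):
    print(gh, 8 ** gh, granularity_hint_region_bytes(gh) // 1024, "KB")
```

At the largest setting, a single entry covers 512 pages of 8KB each, or 4MB.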

The DTB also supports the register-enabled superpage extension. The DTB
superpage maps provide virtual-to-physical address translation for two regions of the
virtual address space.

3.4.2 Miss Address File 

The MTU begins the execution of each load instruction by translating the virtual 
address and by accessing the data cache (Dcache). Translation and Dcache tag read 
operations occur in parallel. If the addressed location is found in the Dcache (a hit), 
then the data from the Dcache is formatted and written to either the integer register
file (IRF) or floating-point register file (FRF). The formatting required depends on
the particular load instruction executed. If the data is not found in the Dcache (a
miss), then the address, target register number, and formatting information are
entered in the miss address file (MAF).

The MAF performs a load-merging function. When a load miss occurs, each MAF
entry is checked to see if it contains a load miss that addresses the same Dcache
(32-byte) block. If it does, and certain merging rules are satisfied, then the new load miss
is merged with an existing MAF entry. This allows the MTU to service two or more
load misses with one data fill from the CBU.

There are six MAF entries for load misses and four more for IDU instruction fetches
and prefetches. Load misses are usually the highest MTU priority.
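
The load-merging idea can be modeled as a lookup keyed by 32-byte block number. This is a simplified sketch: the class and method names are hypothetical, the additional merging rules are not modeled, and returning "full" when all six load entries are in use is an assumption made for illustration only.

```python
DCACHE_BLOCK_BYTES = 32
MAF_LOAD_ENTRIES = 6


class MissAddressFileModel:
    """Toy model of load-miss merging by Dcache block (not the real MAF rules)."""

    def __init__(self) -> None:
        self.entries = {}  # block number -> list of target registers

    def record_load_miss(self, physical_address: int, target_register: int) -> str:
        block = physical_address // DCACHE_BLOCK_BYTES
        if block in self.entries:
            # Same 32-byte block: one fill from the CBU can serve both loads.
            self.entries[block].append(target_register)
            return "merged"
        if len(self.entries) >= MAF_LOAD_ENTRIES:
            return "full"
        self.entries[block] = [target_register]
        return "allocated"


maf = MissAddressFileModel()
print(maf.record_load_miss(0x1000, 1))
print(maf.record_load_miss(0x101F, 2))
print(maf.record_load_miss(0x1020, 3))
```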

3.4.3 Store Execution 

The Dcache follows a write-through protocol. During the execution of a store
instruction, the MTU probes the Dcache to determine whether the location to be
overwritten is currently cached. If so (a Dcache hit), the Dcache is updated.
Regardless of the Dcache state, the MTU forwards the data to the CBU.

A load instruction that is issued one cycle after a store instruction in the pipeline
creates a conflict if both the load and store operations access the same memory location.
(The store instruction has not yet updated the location when the load instruction
reads it.) This conflict is handled by forcing the load instruction to take a replay trap;
that is, the IDU flushes the pipeline and restarts execution from the load instruction.
By the time the load instruction arrives at the Dcache the second time, the conflicting
store instruction has written the Dcache and the load instruction is executed
normally.

Replay traps can be avoided by scheduling the load instruction to issue three cycles 
after the store instruction. If the load instruction is scheduled to issue two cycles after 
the store instruction, then it will be issue-stalled for one cycle. 
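
The scheduling rule for a load that reads the location written by a preceding store can be summarized as a small lookup. This sketch covers only the three cases described above; the function name and the returned strings are illustrative assumptions.

```python
def load_after_store_penalty(cycles_after_store: int) -> str:
    """Classify the cost of issuing a same-address load N cycles after a store."""
    if cycles_after_store < 1:
        raise ValueError("the load must issue at least one cycle after the store")
    if cycles_after_store == 1:
        return "replay trap"            # pipeline flushed, load restarted
    if cycles_after_store == 2:
        return "one-cycle issue stall"  # load held for a single cycle
    return "no penalty"                 # three or more cycles of separation


for distance in (1, 2, 3, 4):
    print(distance, load_after_store_penalty(distance))
```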

3.4.4 Write Buffer 

The MTU also contains a write buffer that has six 32-byte entries. The write buffer 
provides a finite, high-bandwidth resource for receiving store data to minimize the 
number of CPU stall cycles. 

3.5 Cache Control and Bus Interface Unit 

The cache control and bus interface unit (CBU) processes all accesses sent by the 
MTU and implements all memory-related external interface functions, particularly 
the coherence protocol functions for write-back caching. It controls the second-level 
cache (Scache) and the optional board-level backup cache (Bcache). The CBU
handles all instruction and primary Dcache read misses, performs the function of writing
data from the write buffer into the shared coherent memory subsystem, and has a 
major role in executing the Alpha memory barrier (MB) instruction. The CBU also 
controls the 128-bit bidirectional data bus, address bus, and I/O control. 

3.6 Cache Organization 

The 21164 has three onchip caches: a primary L1 data cache, a primary L1
instruction cache, and a second-level L2 combined data and instruction cache. All memory
cells in the onchip caches are fully static, 6-transistor, CMOS structures.

The 21164 also provides control for an optional board-level, external L3 cache. 

3.6.1 Data Cache 

The data cache (Dcache) is a dual-read-ported, single-write-ported, 8KB cache. It is
a write-through, read-allocate, direct-mapped, physical cache with 32-byte blocks. 

3.6.2 Instruction Cache 

The instruction cache (Icache) is an 8KB, virtual, direct-mapped cache with 32-byte 
blocks. Each block tag contains: 

• A 7-bit address space number (ASN) field as defined by the Alpha architecture 

• A 1-bit address space match (ASM) field as defined by the Alpha architecture 

• A 1-bit PALcode (physically addressed) indicator 

Software, rather than Icache hardware, maintains Icache coherence with memory. 

3.6.3 Second-Level Cache 

The second-level cache (Scache) is a 96KB, 3-way, set-associative, physical,
write-back, write-allocate cache with 32-byte or 64-byte blocks. It is a mixed data and
instruction cache. The Scache is fully pipelined; it processes read and write
operations at the rate of one INT16 per CPU cycle and can alternate between read and
write accesses without bubble cycles.

When operating in 32-byte block mode, the Scache has 64-byte blocks with 32-byte 
subblocks, one tag per block. If configured to 32 bytes, the Scache is organized as 
three sets of 512 blocks, with each block divided into two 32-byte subblocks. If con- 
figured to 64 bytes, the Scache is three sets of 512 64-byte blocks. 
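
The organization above accounts for the full 96KB capacity. The following sketch checks the arithmetic and shows one plausible way to derive a block index and subblock number from a physical address; the index and subblock bit positions are assumptions based on the stated geometry, not taken from this data sheet.

```python
SCACHE_SETS = 3
BLOCKS_PER_SET = 512
BLOCK_BYTES = 64
SUBBLOCK_BYTES = 32


def scache_capacity_bytes() -> int:
    """Total Scache capacity implied by three sets of 512 64-byte blocks."""
    return SCACHE_SETS * BLOCKS_PER_SET * BLOCK_BYTES


def scache_block_index(physical_address: int) -> int:
    """Assumed index: the block number within a set (address bits <14:6>)."""
    return (physical_address // BLOCK_BYTES) % BLOCKS_PER_SET


def scache_subblock(physical_address: int) -> int:
    """Assumed subblock select in 32-byte mode (address bit <5>)."""
    return (physical_address % BLOCK_BYTES) // SUBBLOCK_BYTES


print(scache_capacity_bytes() // 1024, "KB")
print(scache_block_index(0x8040), scache_subblock(0x8060))
```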

3.6.4 External Cache 

The CBU implements control for an optional, external, direct-mapped, physical, 
write-back, write-allocate cache with 32-byte or 64-byte blocks. The 21164 supports 
board-level cache sizes of 1MB, 2MB, 4MB, 8MB, 16MB, 32MB, and 64MB. 

3.7 Serial Read-Only Memory Interface 

The serial read-only memory (SROM) interface provides the initialization data load 
path from a system SROM to the instruction cache. Following initialization, this 
interface can function as a diagnostic port by using privileged architecture library 
code (PALcode). 

3.8 Pipeline Organization 

The 21164 has a 7-stage (or 7-cycle) pipeline for integer operate and memory
reference instructions, and a 9-stage pipeline for floating-point operate instructions. The
IDU maintains state for all pipeline stages to track outstanding register write
operations.

Figure 2 shows the integer operate, memory reference, and floating-point operate
pipelines for the IDU, FPU, IEU, and MTU. The first four stages are executed in the
IDU. Remaining stages are executed by the IEU, FPU, MTU, and CBU.

Figure 2 Instruction Pipeline Stages

4 Pinout and Signal Descriptions

Sections 4.1 and 4.2 list and describe the 21164 microprocessor external signals and 
their associated pins. 

4.1 Pin Assignment 

The 21164 has 499 pins aligned in an interstitial pin grid array (IPGA) design.
Table 1 lists the 21164 signal pins and their corresponding pin grid array (PGA)
locations in alphabetic order. There are 296 functional signal pins, 3 spare (unused)
signal pins, 39 external power (Vdd) pins, 65 internal power (Vddi) pins, and 96
ground (Vss) pins.
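
The pin categories listed above add up to the stated package total. A quick check, using illustrative dictionary keys:

```python
# Pin counts by category, as stated in the paragraph above.
pin_counts = {
    "functional signal": 296,
    "spare signal": 3,
    "external power (Vdd)": 39,
    "internal power (Vddi)": 65,
    "ground (Vss)": 96,
}

total_pins = sum(pin_counts.values())
print(total_pins)
```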



Table 1 Alphabetic Signal Pin List

Signal              PGA Location  Signal              PGA Location  Signal              PGA Location

addr_h<4>           BB14          addr_h<5>           BC13          addr_h<6>           BA13
addr_h<7>           AV14          addr_h<8>           AW13          addr_h<9>           BC11
addr_h<10>          BA11          addr_h<11>          AV12          addr_h<12>          AW11
addr_h<13>          BC09          addr_h<14>          BA09          addr_h<15>          AV10
addr_h<16>          AW09          addr_h<17>          BC07          addr_h<18>          BA07
addr_h<19>          AV08          addr_h<20>          AW07          addr_h<21>          BC05
addr_h<22>          BC39          addr_h<23>          AW37          addr_h<24>          AV36
addr_h<25>          BA37          addr_h<26>          BC37          addr_h<27>          AW35
addr_h<28>          AV34          addr_h<29>          BA35          addr_h<30>          BC35
addr_h<31>          AW33          addr_h<32>          AV32          addr_h<33>          BA33
addr_h<34>          BC33          addr_h<35>          AW31          addr_h<36>          AV30
addr_h<37>¹         BA31          addr_h<38>¹         BC31          addr_h<39>          BB30
addr_bus_req_h      E23           addr_cmd_par_h      B20           addr_res_h<0>       C27
addr_res_h<1>       F26           addr_res_h<2>       E27           big_drv_en_h        D40
cack_h              G21           cfail_h             C25           clk_mode_h<0>       AU21
clk_mode_h<1>       BA23          clk_mode_h<2>       BB26          cmd_h<0>            F20
cmd_h<1>            A19           cmd_h<2>            C19           cmd_h<3>            E19
cpu_clk_out_h       BA25          dack_h              B24           data_h<0>           J43
data_h<1>           L39           data_h<2>           M38           data_h<3>           L41
data_h<4>           L43           data_h<5>           N39           data_h<6>           P38
data_h<7>           N41           data_h<8>           N43           data_h<9>           P42
data_h<10>          R39           data_h<11>          T38           data_h<12>          R41
data_h<13>          R43           data_h<14>          U39           data_h<15>          V38
data_h<16>          U41           data_h<17>          U43           data_h<18>          W39
data_h<19>          W41           data_h<20>          W43           data_h<21>          Y38
data_h<22>          Y42           data_h<23>          AA39          data_h<24>          AA41
data_h<25>          AA43          data_h<26>          AB38          data_h<27>          AC43
data_h<28>          AC41          data_h<29>          AC39          data_h<30>          AD42
data_h<31>          AD38          data_h<32>          AE43          data_h<33>          AE41
data_h<34>          AE39          data_h<35>          AG43          data_h<36>          AG41
data_h<37>          AF38          data_h<38>          AG39          data_h<39>          AJ43
data_h<40>          AJ41          data_h<41>          AH38          data_h<42>          AJ39
data_h<43>          AK42          data_h<44>          AL43          data_h<45>          AL41
data_h<46>          AK38          data_h<47>          AL39          data_h<48>          AN43
data_h<49>          AN41          data_h<50>          AM38          data_h<51>          AN39
data_h<52>          AR43          data_h<53>          AR41          data_h<54>          AP38
data_h<55>          AR39          data_h<56>          AU43          data_h<57>          AU41
data_h<58>          AT38          data_h<59>          AU39          data_h<60>          AW43
data_h<61>          AW41          data_h<62>          AV38          data_h<63>          AW39
data_h<64>          J01           data_h<65>          L05           data_h<66>          M06
data_h<67>          L03           data_h<68>          L01           data_h<69>          N05
data_h<70>          P06           data_h<71>          N03           data_h<72>          N01
data_h<73>          P02           data_h<74>          R05           data_h<75>          T06
data_h<76>          R03           data_h<77>          R01           data_h<78>          U05
data_h<79>          V06           data_h<80>          U03           data_h<81>          U01
data_h<82>          W05           data_h<83>          W03           data_h<84>          W01
data_h<85>          Y06           data_h<86>          Y02           data_h<87>          AA05
data_h<88>          AA03          data_h<89>          AA01          data_h<90>          AB06
data_h<91>          AC01          data_h<92>          AC03          data_h<93>          AC05
data_h<94>          AD02          data_h<95>          AD06          data_h<96>          AE01
data_h<97>          AE03          data_h<98>          AE05          data_h<99>          AG01
data_h<100>         AG03          data_h<101>         AF06          data_h<102>         AG05
data_h<103>         AJ01          data_h<104>         AJ03          data_h<105>         AH06
data_h<106>         AJ05          data_h<107>         AK02          data_h<108>         AL01
data_h<109>         AL03          data_h<110>         AK06          data_h<111>         AL05
data_h<112>         AN01          data_h<113>         AN03          data_h<114>         AM06
data_h<115>         AN05          data_h<116>         AR01          data_h<117>         AR03
data_h<118>         AP06          data_h<119>         AR05          data_h<120>         AU01
data_h<121>         AU03          data_h<122>         AT06          data_h<123>         AU05
data_h<124>         AW01          data_h<125>         AW03          data_h<126>         AV06
data_h<127>         AW05          data_bus_req_h      E25           data_check_h<0>     J41
data_check_h<1>     K38           data_check_h<2>     J39           data_check_h<3>     G43
data_check_h<4>     G41           data_check_h<5>     H38           data_check_h<6>     G39
data_check_h<7>     E43           data_check_h<8>     J03           data_check_h<9>     K06
data_check_h<10>    J05           data_check_h<11>    G01           data_check_h<12>    G03
data_check_h<13>    H06           data_check_h<14>    G05           data_check_h<15>    E01
data_ram_oe_h       F22           data_ram_we_h       A23           dc_ok_h             AU23
fill_h              G23           fill_error_h        A25           fill_id_h           F24
fill_nocheck_h      G25           idle_bc_h           A27           index_h<4>          A29
index_h<5>          C29           index_h<6>          F28           index_h<7>          E29
index_h<8>          B30           index_h<9>          A31           index_h<10>         C31
index_h<11>         F30           index_h<12>         E31           index_h<13>         A33
index_h<14>         C33           index_h<15>         F32           index_h<16>         E33
index_h<17>         A35           index_h<18>         C35           index_h<19>         F34
index_h<20>         E35           index_h<21>         A37           index_h<22>         C37
index_h<23>         F36           index_h<24>         E37           index_h<25>         A39
int4_valid_h<0>²    F38           int4_valid_h<1>²    E41           int4_valid_h<2>²    F06
int4_valid_h<3>²    E03           irq_h<0>            BA29          irq_h<1>            AU27
irq_h<2>            BC29          irq_h<3>            AW27          mch_hlt_irq_h       AU25
oe_we_active_low_h  AY40          osc_clk_in_h        BC21          osc_clk_in_l        BB22
perf_mon_h          AW29          port_mode_h<0>      AY20          port_mode_h<1>      BB20
pwr_fail_irq_h      AV26          ref_clk_in_h        AW25          scache_set_h<0>     C17
scache_set_h<1>     A17           shared_h            C23           srom_clk_h          BA19
srom_data_h         BC19          srom_oe_l           AW19          srom_present_l      AV20
st_clk1_h           E05           st_clk2_h           E39           system_lock_flag_h  G27
sys_clk_out1_h      AW23          sys_clk_out1_l      BB24          sys_clk_out2_h      AV24
sys_clk_out2_l      BC25          sys_mch_chk_irq_h   BA27          sys_reset_l         BC27
tag_ctl_par_h       F18           tag_data_h<20>      A05           tag_data_h<21>      E07
tag_data_h<22>      F08           tag_data_h<23>      C07           tag_data_h<24>      A07
tag_data_h<25>      E09           tag_data_h<26>      F10           tag_data_h<27>      C09
tag_data_h<28>      A09           tag_data_h<29>      E11           tag_data_h<30>      F12
tag_data_h<31>      C11           tag_data_h<32>      A11           tag_data_h<33>      E13
tag_data_h<34>      F14           tag_data_h<35>      C13           tag_data_h<36>      A13
tag_data_h<37>      B14           tag_data_h<38>      E15           tag_data_par_h      C15
tag_dirty_h         E17           tag_ram_oe_h        C21           tag_ram_we_h        A21
tag_shared_h        A15           tag_valid_h         F16           tck_h               AW17
tdi_h               BC17          tdo_h               BA17          temp_sense          AW15
test_status_h<0>    BA15          test_status_h<1>    AV16          tms_h               AV18
trst_l              BC15          victim_pending_h    E21           spare               D04
spare               AY04          spare_io<250>       AV28

Signal              PGA Location

Vss (Metal plane 6)
    A03, A41, AA07, AA37, AC07, AC37, AD04, AD40, AF02, AF42, AG07,
    AG37, AH04, AH40, AL07, AL37, AM04, AM40, AP02, AP42, AR07,
    AR37, AT04, AT40, AU09, AU13, AU17, AU31, AU35, AV02, AV22,
    AV42, AW21, AY08, AY12, AY16, AY22, AY24, AY28, AY32, AY36, B02,
    B06, B10, B18, B26, B34, B38, B42, BA01, BA21, BA43, BB02, BB06,
    BB10, BB18, BB34, BB38, BB42, BC03, BC41, C01, C43, D08, D12, D16,
    D20, D24, D28, D32, D36, F02, F42, G09, G13, G17, G31, G35, H04, H40,
    J07, J37, K02, K42, M04, M40, N07, N37, T04, T40, U07, U37, V02, V42,
    Y04, Y40

Vdd (Metal plane 4)
    AB04, AB40, AF04, AF40, AK04, AK40, AP04, AP40, AV04, AV40, AY06,
    AY10, AY14, AY18, AY26, AY30, AY34, AY38, BA03, BA41, C03, C41,
    D06, D10, D14, D18, D22, D26, D30, D34, D38, F04, F40, K04, K40, P04,
    P40, V04, V40

Vddi (Metal plane 2)
    AB02, AB42, AE07, AE37, AH02, AH42, AJ07, AJ37, AM02, AM42, AN07,
    AN37, AT02, AT42, AU07, AU11, AU15, AU19, AU29, AU33, AU37,
    AY02, AY42, B04, B08, B12, B16, B22, B28, B32, B36, B40, BA05, BA39,
    BB04, BB08, BB12, BB16, BB28, BB32, BB36, BB40, BC23, C05, C39,
    D02, D42, G11, G15, G19, G29, G33, G37, H02, H42, L07, L37, M02, M42,
    R07, R37, T02, T42, W07, W37

¹ When byte/word instructions are enabled and addr_h<39> is asserted, addr_h<38:37> become
transfer_size_h<1:0>.

² When byte/word instructions are enabled and addr_h<39> is asserted, int4_valid_h<3:0> become
addr_h<3:0>.

4.2 21164 Packaging

Figure 3 shows the 21164 pinout from the top view with pins facing down.

Figure 3 21164 Top View (Pin Down)

Figure 4 shows the 21164 pinout from the bottom view with pins facing up.

Figure 4 21164 Bottom View (Pin Up)

4.3 21164 Microprocessor Logic Symbol

Figure 5 shows the logic symbol for the 21164 chip.

Figure 5 21164 Microprocessor Logic Symbol

4.4 21164 Signal Names and Functions 

The following table defines the 21164 signal types referred to in this section:

Signal Type    Definition

B              Bidirectional
I              Input only
O              Output only

The remaining two tables describe the function of each 21164 external signal.
Table 2 lists all signals in alphanumeric order. This table provides full signal
descriptions. Table 3 lists signals by function and provides an abbreviated description.

Table 2 21164 Signal Descriptions

Signal              Type  Count  Description

addr_h<39:4>        B     36     Address bus. These bidirectional signals provide the
                                 address of the requested data or operation between the
                                 21164 and the system. If addr_h<39> is asserted, then
                                 the reference is to noncached, I/O memory space.

                                 When the byte/word instructions are enabled and
                                 addr_h<39> is asserted, 6 additional bits of
                                 information are communicated over the pin bus. Two of
                                 the new bits are driven over addr_h<38:37>, becoming
                                 transfer_size<1:0>, with the following values:

                                 Bits  Size
                                 00    8 bytes
                                 01    4 bytes
                                 10    2 bytes
                                 11    1 byte

addr_bus_req_h      I     1      Address bus request. The system interface uses this
                                 signal to gain control of the addr_h<39:4>,
                                 addr_cmd_par_h, and cmd_h<3:0> pins.

addr_cmd_par_h      B     1      Address command parity. This is the odd parity bit on
                                 the current command and address buses. The 21164 takes
                                 a machine check if a parity error is detected. The
                                 system should do the same if it detects an error.

addr_res_h<1:0>     O     2      Address response bits <1> and <0>. For system commands,
                                 the 21164 uses these pins to indicate the state of the
                                 block in the Scache:

                                 Bits  Command      Meaning
                                 00    NOP          Nothing.
                                 01    NOACK        Data not found or clean.
                                 10    ACK/Scache   Data from Scache.
                                 11    ACK/Bcache   Data from Bcache.

addr_res_h<2>       O     1      Address response bit <2>. For system commands, the
                                 21164 uses this pin to indicate if the command hits in
                                 the Scache or onchip load lock register.

big_drv_en_h        I     1      This signal provides the ability to change the output
                                 drive characteristics of index_h<25:4>, st_clk1_h,
                                 st_clk2_h, data_ram_oe_h, data_ram_we_h, tag_ram_oe_h,
                                 and tag_ram_we_h. When asserted, big_drv_en_h increases
                                 the drive capability of these signals by 50%,
                                 eliminating the need to buffer these heavily loaded
                                 signals. This signal is defined during power-up and
                                 must not change state during operation.

cack_h              I     1      Command acknowledge. The system interface uses this
                                 signal to acknowledge any one of the commands driven by
                                 the 21164.

cfail_h             I     1      Command fail. This signal has two uses. It can be
                                 asserted during a cack cycle of a WRITE BLOCK LOCK
                                 command to indicate that the write operation is not
                                 successful. In this case, both cack_h and cfail_h are
                                 asserted together. It can also be asserted instead of
                                 cack_h to force an instruction fetch/decode unit (IDU)
                                 timeout event. This causes the 21164 to do a partial
                                 reset and trap to the machine check (MCHK) PALcode
                                 entry point, which indicates a serious hardware error.

clk_mode_h<2:0>     I     3      Clock test mode. These signals specify a relationship
                                 between osc_clk_in_h,l and the CPU cycle time. These
                                 signals should be deasserted in normal operation mode.

                                 Bits     Divisor  Description
                                 000      2        CPU clock frequency is one-half of
                                                   input clock frequency.
                                 001      1        CPU clock frequency is equal to the
                                                   input clock frequency, but the onchip
                                                   duty-cycle equalizer is disabled.
                                 010      4        CPU clock frequency is one-fourth of
                                                   input clock frequency.
                                 011      -        Initialize the CPU clock, allowing the
                                                   system clock to be synchronized to a
                                                   stable reference clock.
                                 101      1        CPU clock frequency is equal to input
                                                   clock frequency, and the onchip
                                                   duty-cycle equalizer is enabled. This
                                                   is the preferred mode for normal
                                                   operation.
                                 100/11x  -        Reserved for DIGITAL.

cmd_h<3:0>          B     4      Command bus. These signals drive and receive the
                                 commands from the command bus. The following tables
                                 define the commands that can be driven on the
                                 cmd_h<3:0> bus by the 21164 or the system.

                                 21164 Commands to System:

                                 cmd_h<3:0>  Command           Meaning
                                 0000        NOP               Nothing.
                                 0001        LOCK              Lock register address.
                                 0010        FETCH             The 21164 passes a FETCH
                                                               instruction to the system.
                                 0011        FETCH_M           The 21164 passes a FETCH_M
                                                               instruction to the system.
                                 0100        MEMORY BARRIER    MB instruction.
                                 0101        SET DIRTY         Dirty bit set if shared bit
                                                               is clear.
                                 0110        WRITE BLOCK       Request to write a block.
                                 0111        WRITE BLOCK LOCK  Request to write a block
                                                               with lock.
                                 1000        READ MISS0        Request for data.
                                 1001        READ MISS1        Request for data.
                                 1010        READ MISS MOD0    Request for data; modify
                                                               intent.
                                 1011        READ MISS MOD1    Request for data; modify
                                                               intent.
                                 1100        BCACHE VICTIM     Bcache victim should be
                                                               removed.
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Table 2 21 1 64 Signal Descriptions (Sheet 5 of 12) 

Signal Type Count Description 



1101 — Reserved. 

1110 READ MISS STCO Request for data; STjc_C data. 

1 1 1 1 READ MISS STC 1 Request for data; STjc_C data. 



System Commands to 21164: 


cmd h 






<3:0> 


Command 


Meaning 


0000 


NOP 


Nothing. 


0001 


FLUSH 


Removes block from caches; 
return dirty data. 


0010 


INVALIDATE 


InvaHdates the block from 
caches. 


0011 


SET SHARED 


Block goes to the shared state. 


0100 


READ 


Read a block. 


0101 


READ DIRTY 


Read a block; set shared. 


0111 


READ DIRTY/INV 


Read a block; invalidate. 



cpu_clk_out_h         O     1      CPU clock output. This signal is used for test purposes.

dack_h                I     1      Data acknowledge. The system interface uses this signal to control data transfer between the 21164 and the system.

data_h<127:0>         B     128    Data bus. These signals are used to move data between the 21164, the system, and the Bcache.

data_bus_req_h        I     1      Data bus request. If the 21164 samples this signal asserted on the rising edge of sysclk n, then the 21164 does not drive the data bus on the rising edge of sysclk n+1. Before asserting this signal, the system should assert idle_bc_h for the correct number of cycles. If the 21164 samples this signal deasserted on the rising edge of sysclk n, then the 21164 drives the data bus on the rising edge of sysclk n+1.



data_check_h<15:0>    B     16     Data check. These signals set even byte parity or INT8 ECC for the current data cycle.

data_ram_oe_h         O     1      Data RAM output enable. This signal is asserted for Bcache read operations.

data_ram_we_h         O     1      Data RAM write-enable. This signal is asserted for any Bcache write operation.

dc_ok_h               I     1      DC voltage OK. Must be deasserted until dc voltage reaches proper operating level. After that, dc_ok_h is asserted.

fill_h                I     1      Fill warning. If the 21164 samples this signal asserted on the rising edge of sysclk n, then the 21164 provides the address indicated by fill_id_h to the Bcache on the rising edge of sysclk n+1. The Bcache begins to write in that sysclk. At the end of sysclk n+1, the 21164 waits for the next sysclk and then begins the write operation again if dack_h is not asserted.

fill_error_h          I     1      Fill error. If this signal is asserted during a fill from memory, it indicates to the 21164 that the system has detected an invalid address or hard error. The system still provides an apparently normal read sequence with correct ECC/parity though the data is not valid. The 21164 traps to the machine check (MCHK) PALcode entry point and indicates a serious hardware error. fill_error_h should be asserted when the data is returned. Each assertion produces a MCHK trap.

fill_id_h             I     1      Fill identification. Asserted with fill_h to indicate which register is used. The 21164 supports two outstanding load instructions. If this signal is asserted when the 21164 samples fill_h asserted, then the 21164 provides the address from miss register 1. If it is deasserted, then the address in miss register 0 is used for the read operation.

fill_nocheck_h        I     1      Fill checking off. If this signal is asserted, then the 21164 does not check the parity or ECC for the current data cycle on a fill.

idle_bc_h             I     1      Idle Bcache. When asserted, the 21164 finishes the current Bcache read or write operation but does not start a new read or write operation until the signal is deasserted. The system interface must assert this signal in time to idle the Bcache before fill data arrives.

index_h<25:4>         O     22     Index. These signals index the Bcache.
int4_valid_h<3:0>     O     4      INT4 data valid. During write operations to noncached space, these signals are used to indicate which INT4 bytes of data are valid. This is useful for noncached write operations that have been merged in the write buffer.

    int4_valid_h<3:0>  Write Meaning
    xxx1               data_h<31:0> valid
    xx1x               data_h<63:32> valid
    x1xx               data_h<95:64> valid
    1xxx               data_h<127:96> valid

    During read operations to noncached space, these signals indicate which INT8 bytes of a 32-byte block need to be read and returned to the processor. This is useful for read operations to noncached memory.

    int4_valid_h<3:0>  Read Meaning
    xxx1               data_h<63:0> valid
    xx1x               data_h<127:64> valid
    x1xx               data_h<191:128> valid
    1xxx               data_h<255:192> valid

    Note: For both read and write operations, multiple int4_valid_h<3:0> bits can be set simultaneously.

    When addr_h<39> is asserted, the int4_valid_h<3:0> signals are considered the addr_h<3:0> bits required for byte/word transactions. The functionality of these bits is tied to the value stored in addr_h<38:37>.

    For Read Transactions:

    addr_h<38:37>  int4_valid_h<3:0> Value
    00             Valid INT8 mask
    01             addr_h<3:2> valid on int4_valid_h<3:2>; int4_valid_h<1:0> undefined
    10             addr_h<3:1> valid on int4_valid_h<3:1>; int4_valid_h<0> undefined
    11             addr_h<3:0> valid on int4_valid_h<3:0>

    For Write Transactions:

    addr_h<38:37>  int4_valid_h<3:0> Value
    00             Valid INT4 mask
    01             Valid INT4 mask
    10             addr_h<3:1> valid on int4_valid_h<3:1>; int4_valid_h<0> undefined
    11             addr_h<3:0> valid on int4_valid_h<3:0>
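
The mask interpretation of int4_valid_h<3:0> can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `decode_int4_valid` and its arguments are hypothetical, and it covers only the case where the signals act as a valid mask (INT4 units on writes, INT8 units on reads), not the byte/word address mode selected by addr_h<39>.

```python
def decode_int4_valid(mask: int, is_write: bool) -> list[tuple[int, int]]:
    """Return the (high, low) data bit ranges that a 4-bit valid mask flags.

    Hypothetical helper: writes are masked per INT4 (32 bits),
    reads per INT8 (64 bits of a 32-byte block).
    """
    if not 0 <= mask <= 0xF:
        raise ValueError("int4_valid_h is a 4-bit field")
    width = 32 if is_write else 64
    return [(width * (bit + 1) - 1, width * bit)
            for bit in range(4) if mask & (1 << bit)]


# A merged noncached write with INT4 units 0 and 2 valid.
print(decode_int4_valid(0b0101, is_write=True))
# A noncached read that needs only the last INT8 of the block.
print(decode_int4_valid(0b1000, is_write=False))
```

Because several mask bits can be set at once, the helper returns a list of ranges rather than a single range.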
irq_h<3:0>            I     4      System interrupt requests. These signals have multiple modes of operation. During normal operation, these level-sensitive signals are used to signal interrupt requests. During initialization, these signals are used to set up the CPU cycle time divisor for sys_clk_out1_h,l as follows:

    irq_h<3>  irq_h<2>  irq_h<1>  irq_h<0>  Ratio
    Low       Low       High      High      3
    Low       High      Low       Low       4
    Low       High      Low       High      5
    Low       High      High      Low       6
    Low       High      High      High      7
    High      Low       Low       Low       8
    High      Low       Low       High      9
    High      Low       High      Low       10
    High      Low       High      High      11
    High      High      Low       Low       12
    High      High      Low       High      13
    High      High      High      Low       14
    High      High      High      High      15

mch_hlt_irq_h         I     1      Machine halt interrupt request. This signal has multiple modes of operation. During initialization, this signal is used to set up sys_clk_out2_h,l delay. During normal operation, it is used to signal a halt request.

oe_we_active_low_h    I     1      This signal provides the ability to control the polarity of the offchip cache RAM control signals (data_ram_oe_h, data_ram_we_h, tag_ram_oe_h, and tag_ram_we_h). When this signal is deasserted, the offchip cache signals are asserted high. When this signal is asserted, the assertion levels of the cache signals are inverted to a low level. This signal is defined during power-up and must not change state during operation.
osc_clk_in_h          I     2      Oscillator clock inputs. These signals provide the differential clock input that is the fundamental timing of the 21164. These signals are driven at the same frequency as the internal clock frequency (clk_mode_h<2:0> = 101).
osc_clk_in_l

perf_mon_h            I     1      Performance monitor. This signal can be used as an input to the 21164 internal performance monitoring hardware from offchip events (such as bus activity).

port_mode_h<1:0>      I     2      Select test port interface modes (normal, manufacturing, and debug). For normal operation, both signals must be deasserted.

pwr_fail_irq_h        I     1      Power failure interrupt request. This signal has multiple modes of operation. During initialization, this signal is used to set up sys_clk_out2_h,l delay. During normal operation, this signal is used to signal a power failure.

ref_clk_in_h          I     1      Reference clock input. Optional. Used to synchronize the timing of multiple microprocessors to a single reference clock. If this signal is not used, it must be tied to Vdd for proper operation.

scache_set_h<1:0>     O     2      Secondary cache set. During a read miss request, these signals indicate the Scache set number that will be filled when the data is returned. This information can be used by the system to maintain a duplicate copy of the Scache tag store.

shared_h              I     1      Keep block status shared. For systems without a Bcache, when a WRITE BLOCK/NO VICTIM PENDING or WRITE BLOCK LOCK command is acknowledged, this pin can be used to keep the block status shared or private in the Scache.

srom_clk_h            O     1      Serial ROM clock. Supplies the clock that causes the SROM to advance to the next bit. The cycle time of this clock is 128 times the cycle time of the CPU clock.

srom_data_h           I     1      Serial ROM data. Input for the SROM.

srom_oe_l             O     1      Serial ROM output enable. Supplies the output enable to the SROM.

srom_present_l (1)    B     1      Serial ROM present. Indicates that SROM is present and ready to load the Icache.
st_clk1_h             O     1      STRAM clock. Clock for synchronously timed RAMs (STRAMs). For Bcache, this signal is synchronous with index_h<25:4> during private read and write operations, and with sys_clk_out1_h,l during read and fill operations. BC_CONTROL<26> must be set to use this.

st_clk2_h             O     1      This signal is a duplicate of st_clk1_h, increasing the fanout capability of the signal.

system_lock_flag_h    I     1      System lock flag. During fills, the 21164 logically ANDs the value of the system copy with its own copy to produce the true value of the lock flag.

sys_clk_out1_h        O     2      System clock outputs. Programmable system clock (cpu_clk_out_h divided by a value of 3 to 15) is used for board-level cache and system logic.
sys_clk_out1_l

sys_clk_out2_h        O     2      System clock outputs. A version of sys_clk_out1_h,l delayed by a programmable amount from 0 to 7 CPU cycles.
sys_clk_out2_l

sys_mch_chk_irq_h     I     1      System machine check interrupt request. This signal has multiple modes of operation. During initialization, it is used to set up sys_clk_out2_h,l delay. During normal operation, it is used to signal a machine check interrupt request.

sys_reset_l           I     1      System reset. This signal protects the 21164 from damage during initial power-up. It must be asserted until dc_ok_h is asserted. After that, it is deasserted and the 21164 begins its reset sequence.

tag_ctl_par_h         B     1      Tag control parity. This signal indicates odd parity for tag_valid_h, tag_shared_h, and tag_dirty_h. During fills, the system should drive the correct parity based on the state of the valid, shared, and dirty bits.

tag_data_h<38:20>     B     19     Bcache tag data bits. This bit range supports 1MB to 64MB Bcaches.

tag_data_par_h        B     1      Tag data parity bit. This signal indicates odd parity for tag_data_h<38:20>.

tag_dirty_h           B     1      Tag dirty state bit. During fills, the system should assert this signal if the 21164 request is a READ MISS MOD, and the shared bit is not asserted.
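
The fill-time rules for tag_ctl_par_h and tag_dirty_h can be illustrated with a short Python sketch. It is not taken from the data sheet: the function names are hypothetical, and it assumes the usual meaning of odd parity, namely that the parity bit is chosen so that the covered bits plus the parity bit contain an odd number of ones.

```python
def tag_ctl_parity(valid: int, shared: int, dirty: int) -> int:
    """Odd parity bit over the valid, shared, and dirty tag control bits.

    Assumption: odd parity means the three control bits plus the
    parity bit together hold an odd number of ones.
    """
    ones = valid + shared + dirty
    return 1 if ones % 2 == 0 else 0


def fill_tag_dirty(is_read_miss_mod: bool, shared: int) -> int:
    """Dirty bit the system drives on a fill, per the tag_dirty_h description."""
    return 1 if is_read_miss_mod and not shared else 0


# Fill for a READ MISS MOD request to an unshared block.
valid, shared = 1, 0
dirty = fill_tag_dirty(True, shared)
print(dirty, tag_ctl_parity(valid, shared, dirty))
```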
tag_ram_oe_h          O     1      Tag RAM output enable. This signal is asserted during any Bcache read operation.

tag_ram_we_h          O     1      Tag RAM write-enable. This signal is asserted during any tag write operation. During the first CPU cycle of a write operation, the write pulse is deasserted. In the second and following CPU cycles of a write operation, the write pulse is asserted if the corresponding bit in the write pulse register is asserted. Bits BC_WE_CTL<8:0> control the shape of the pulse.

tag_shared_h          B     1      Tag shared bit. During fills, the system should drive this signal with the correct value to mark the cache block as shared.

tag_valid_h           B     1      Tag valid bit. During fills, this signal is asserted to indicate that the block has valid data.

tck_h                 I     1      JTAG boundary-scan clock.

tdi_h                 I     1      JTAG serial boundary-scan data-in signal.

tdo_h                 O     1      JTAG serial boundary-scan data-out signal.

temp_sense            I     1      Temperature sense. This signal is used to measure the die temperature and is for manufacturing use only. For normal operation, this signal must be left disconnected.

test_status_h<1:0>    O     2      Icache test status. These signals are used for manufacturing test purposes only to extract Icache test status information from the chip. test_status_h<0> is asserted if ICSR<39> is true, on IDU timeout, or remains asserted if the Icache built-in self-test (BiSt) fails. Also, test_status_h<0> outputs the value written by PALcode to test_status_h<1> through IPR access.

tms_h                 I     1      JTAG test mode select signal.

trst_l (1)            B     1      JTAG test access port (TAP) reset signal.

victim_pending_h      O     1      Victim pending. When asserted, this signal indicates that the current read miss has generated a victim.

(1) This signal is shown as bidirectional. However, for normal operation, it is input only. The output function is used during manufacturing test and verification only.
Table 3 lists signals by function and provides an abbreviated description.

Table 3 21164 Signal Descriptions by Function

Signal                Type  Count  Description

Clocks

clk_mode_h<2:0>       I     3      Clock test mode.
cpu_clk_out_h         O     1      CPU clock output.
osc_clk_in_h,l        I     2      Oscillator clock inputs.
ref_clk_in_h          I     1      Reference clock input.
st_clk1_h             O     1      Bcache STRAM clock output.
st_clk2_h             O     1      Bcache STRAM clock output.
sys_clk_out1_h,l      O     2      System clock outputs.
sys_clk_out2_h,l      O     2      System clock outputs.
sys_reset_l           I     1      System reset.

Bcache

big_drv_en_h          I     1      Increase drive capability enable.
data_h<127:0>         B     128    Data bus.
data_check_h<15:0>    B     16     Data check.
data_ram_oe_h         O     1      Data RAM output enable.
data_ram_we_h         O     1      Data RAM write-enable.
index_h<25:4>         O     22     Index.
oe_we_active_low_h    I     1      Assertion-level control signal.
tag_ctl_par_h         B     1      Tag control parity.
tag_data_h<38:20>     B     19     Bcache tag data bits.
tag_data_par_h        B     1      Tag data parity bit.
tag_dirty_h           B     1      Tag dirty state bit.
tag_ram_oe_h          O     1      Tag RAM output enable.
tag_ram_we_h          O     1      Tag RAM write-enable.
tag_shared_h          B     1      Tag shared bit.
tag_valid_h           B     1      Tag valid bit.

System Interface

addr_h<39:4>          B     36     Address bus.
addr_bus_req_h        I     1      Address bus request.
addr_cmd_par_h        B     1      Address command parity.
addr_res_h<2:0>       O     3      Address response.
cack_h                I     1      Command acknowledge.
cfail_h               I     1      Command fail.
cmd_h<3:0>            B     4      Command bus.
dack_h                I     1      Data acknowledge.
data_bus_req_h        I     1      Data bus request.
fill_h                I     1      Fill warning.
fill_error_h          I     1      Fill error.
fill_id_h             I     1      Fill identification.
fill_nocheck_h        I     1      Fill checking off.
idle_bc_h             I     1      Idle Bcache.
int4_valid_h<3:0>     O     4      INT4 data valid.
scache_set_h<1:0>     O     2      Secondary cache set.
shared_h              I     1      Keep block status shared.
system_lock_flag_h    I     1      System lock flag.
victim_pending_h      O     1      Victim pending.

Interrupts

irq_h<3:0>            I     4      System interrupt requests.
mch_hlt_irq_h         I     1      Machine halt interrupt request.
pwr_fail_irq_h        I     1      Power failure interrupt request.
sys_mch_chk_irq_h     I     1      System machine check interrupt request.

Test Modes and Miscellaneous

dc_ok_h               I     1      DC voltage OK.
perf_mon_h            I     1      Performance monitor.
port_mode_h<1:0>      I     2      Select test port interface modes (normal, manufacturing, and debug).
srom_clk_h            O     1      Serial ROM clock.
srom_data_h           I     1      Serial ROM data.
srom_oe_l             O     1      Serial ROM output enable.
srom_present_l (1)    B     1      Serial ROM present.
tck_h                 I     1      JTAG boundary-scan clock.
tdi_h                 I     1      JTAG serial boundary-scan data in.
tdo_h                 O     1      JTAG serial boundary-scan data out.
temp_sense            I     1      Temperature sense.
test_status_h<1:0>    O     2      Icache test status.
tms_h                 I     1      JTAG test mode select.
trst_l (1)            B     1      JTAG test access port (TAP) reset.

(1) This signal is shown as bidirectional. However, for normal operation, it is input only. The output function is used during manufacturing test and verification only.

5 21164 Microprocessor Functional Overview

This section provides an overview of 21164 external signals that support the following:

• Clocks

• Bcache interface

• System interface

• Interrupts

• Test modes

See Figure 1 for a block diagram of the 21164.

5.1 Clocks 

The 21164 accepts two clock signal inputs and develops three clock signal outputs: 



Signal                Description

Input Clock Signals

osc_clk_in_h,l        Differential inputs normally driven at the desired internal frequency.

ref_clk_in_h          A system-supplied clock to which the 21164 synchronizes its timing for multiprocessor systems.

Output Clock Signals

cpu_clk_out_h         A 21164 internal clock that may or may not drive the system clock.

sys_clk_out1_h,l      A clock of programmable speed supplied to the external interface.

sys_clk_out2_h,l      A delayed copy of sys_clk_out1_h,l. The delay is programmable and is an integer number of cpu_clk_out_h periods.

Figure 6 shows the 21164 clock signals.

Figure 6 21164 Clock Signals

[Figure: inputs clk_mode_h<2:0>, dc_ok_h, osc_clk_in_h, osc_clk_in_l, ref_clk_in_h, and sys_reset_l; outputs cpu_clk_out_h, sys_clk_out1_h, sys_clk_out1_l, sys_clk_out2_h, and sys_clk_out2_l.]

5.1.1 CPU Clock 



The 21164 uses the differential input clock lines osc_clk_in_h,l as a source to gener- 
ate its CPU clock. The input signals clk_mode_h<2:0> control generation of the 
CPU clock. 



5.1.2 System Clock 



The CPU clock is divided by a programmable value of 3 to 15 to generate a system 
clock. The programmable feature allows the system designer maximum flexibility 
when choosing external logic to interface with the 21164. 

The sys_clk_out1_h,l signals are delayed by a programmable number of CPU cycles
between 0 and 7 to produce sys_clk_out2_h,l. The output of the programmable
divider is symmetric if the divisor is even. The output is asymmetric if the divisor is
odd.
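
The relationship between the CPU clock, the divisor, and the delay can be worked through numerically. The Python sketch below is illustrative only: the function name `system_clock` and its result keys are hypothetical, and the sample frequencies are arbitrary values chosen for the arithmetic.

```python
def system_clock(cpu_mhz: float, divisor: int, delay_cycles: int = 0) -> dict:
    """Derive system clock figures from the CPU clock (hypothetical helper)."""
    if not 3 <= divisor <= 15:
        raise ValueError("divisor must be between 3 and 15")
    if not 0 <= delay_cycles <= 7:
        raise ValueError("delay must be between 0 and 7 CPU cycles")
    cpu_period_ns = 1000.0 / cpu_mhz
    return {
        "sys_clk_mhz": cpu_mhz / divisor,
        "sys_clk_period_ns": divisor * cpu_period_ns,
        # An even divisor gives a symmetric divider output.
        "symmetric": divisor % 2 == 0,
        # sys_clk_out2 lags sys_clk_out1 by whole CPU cycles.
        "out2_delay_ns": delay_cycles * cpu_period_ns,
    }


print(system_clock(600.0, 4, delay_cycles=2))
print(system_clock(500.0, 5))
```

The sketch reports only whether the output is symmetric; how an odd divisor splits its high and low phases is not stated here, so it is not modeled.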

Figure 7 shows the 21164 driving the system clock on a uniprocessor system.

Figure 7 21164 Uniprocessor Clock

[Figure: the 21164, a memory ASIC, and a bus ASIC connected by sys_clk_out.]


5.1.3 Reference Clock 

The 21164 provides a reference clock input so that other CPUs and system devices 
can be synchronized in multiprocessor systems. If a clock is asserted on signal 
ref_clk_in_h, then the sys_clk_out1_h,l signals are synchronized to that reference
clock by means of a digital phase-locked loop (DPLL). Figure 8 shows the 21164 
synchronized to a system reference clock. 

Figure 8 21164 Reference Clock for Multiprocessor Systems 
5.2 Board-Level Backup Cache Interface

The 21164 includes an interface and control for an optional board-level backup 
cache (Bcache). This section describes the Bcache interface. The Bcache interface is 
made up of the following: 

• A data bus (which it shares with the system interface) 

• Tag and tag control bits for determining hit and coherence 

• SRAM output and SRAM write control signals 

Figure 9 shows the 21164 Bcache interface signals.

Figure 9 21164 Bcache Interface Signals

The Bcache interface is managed by the cache control and bus interface unit (CBU).
The Bcache interface is a 128-bit bidirectional data bus. The read and write speed of
the Bcache can be programmed independently of each other and independently of the
system clock ratio. Optionally, the Bcache can operate in a pseudo-pipeline manner.
Internal processor registers are used to program the Bcache timing and to enable
wave pipelining. See the DIGITAL Alpha 21164 Microprocessor Hardware
Reference Manual for more information.



The Bcache system supports block sizes of 32 bytes or 64 bytes, but it must be set
to the same block size as the secondary cache (Scache). The block size is selected by
a mode bit. The Scache is 3-way, set-associative but is a subset of the larger externally
implemented, direct-mapped Bcache. In systems with no Bcache, the Scache block size
must be set to 64 bytes.

5.2.1 Bcache Victim Buffers 

The 21164 is designed to support systems with one or more offchip Bcache victim 
buffers. External victim buffers improve the overall performance of the Bcache. A 
Bcache victim is generated when the 21164 deallocates a dirty block from the 
Bcache. Each time a Bcache victim is produced, the 21164 stops reading the Bcache 
until the system takes the current victim, and then the Bcache operations resume. 

5.2.2 Cache Coherence Protocol 

Cache coherency is a concern for single and multiprocessor 21164-based systems as 
there may be several caches on a processor module and several more in multiproces- 
sor systems. 

The system hardware designer need not be concerned about Icache and Dcache
coherency. Coherency of the Icache is a software concern: it is flushed with an IMB
(PALcode) instruction. The 21164 maintains coherency between the Dcache and the
Scache.

If the system does not have a Bcache, the system designer must create mechanisms 
in the system interface logic to support cache coherency between the Scache, main 
memory, and other caches in the system. 

If the system has a Bcache, the 21164 maintains cache coherency between the 
Scache and the Bcache. The Scache is a subset of the Bcache. In this case, the 
designer must create mechanisms in the system interface logic to support cache 
coherency between the Bcache, main memory, and other caches in the system. 

The following tasks must be performed to maintain cache coherency: 

• The CBU in the 21164 maintains coherency in the Dcache and keeps it as a sub- 
set of the Scache. 

• If an optional Bcache is present, then the 21164 maintains the Scache as a subset
of the Bcache. The Scache is set-associative but is kept a subset of the larger
externally implemented direct-mapped Bcache.

• System logic must help the 21164 to keep the Bcache coherent with main memory
and other caches in the system.

• The Icache is not a subset of any cache and also is not kept coherent with the 
memory system. 

Table 4 describes the Bcache states that determine cache coherence protocol for
21164 systems.

Table 4 Bcache States for Cache Coherency Protocols

Valid(1)  Shared(1)  Dirty(1)  State of Cache Line

0         X          X         Not valid.

1         0          0         Valid for read or write operations. This cache line contains the only cached copy of the block and the copy in memory is identical to this line.

1         0          1         Valid for read or write operations. This cache line contains the only cached copy of the block. The contents of the block have been modified more recently than the copy in memory.

1         1          0         Valid for read or write operations. This block may be in another CPU's cache.

1         1          1         Valid for read or write operations. This block may be in another CPU's cache. The contents of the block have been modified more recently than the copy in memory.

(1) The tag_valid_h, tag_shared_h, and tag_dirty_h signals are described in Table 2.
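
The state table above maps directly onto a small lookup. The following Python sketch is only an illustration of how the three tag bits combine; the names `LINE_STATES` and `bcache_line_state` are hypothetical and the returned strings paraphrase the table.

```python
# Keyed by (shared, dirty); only consulted when the line is valid.
LINE_STATES = {
    (0, 0): "only cached copy; identical to memory",
    (0, 1): "only cached copy; modified more recently than memory",
    (1, 0): "may be in another CPU's cache",
    (1, 1): "may be in another CPU's cache; modified more recently than memory",
}


def bcache_line_state(valid: int, shared: int, dirty: int) -> str:
    """Describe a Bcache line from its valid, shared, and dirty tag bits."""
    if not valid:
        # Shared and dirty are don't-care when the line is not valid.
        return "not valid"
    return "valid: " + LINE_STATES[(int(bool(shared)), int(bool(dirty)))]


for bits in [(0, 1, 1), (1, 0, 0), (1, 0, 1), (1, 1, 0), (1, 1, 1)]:
    print(bits, "->", bcache_line_state(*bits))
```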

5.3 System Interface 

The system interface is made up of bidirectional address and command buses, a data 
bus that it shares with the Bcache interface, and several control signals. 

Figure 10 shows the 21164 system interface signals.

Figure 10 21164 System Interface Signals

The system interface is under the control of the cache control and bus interface unit
(CBU). The system interface is a 128-bit bidirectional data bus. The clock rate of the
system interface is programmable from one-third to one-fifteenth of the CPU clock
rate (a cycle time of 3 to 15 CPU cycles). All system interface signals are driven or
sampled by the 21164 on the rising edge of sys_clk_out1_h.

5.3.1 Commands and Addresses 

The 21164 can take up to two commands from the system at a time. The bus inter- 
face buffer can hold one or two misses and one or two Scache victim addresses at a 
time. A miss occurs when the 21164 searches its caches but does not find the 
addressed block. The 21164 can queue two misses to the system. An Scache victim 
occurs when the 21164 deallocates a dirty block from the Scache. 

The system requests the misses, and the victims arbitrate for the Bcache. 

• The highest priority for the Bcache is data movement for the system, which 
includes fill, read dirty data, invalidate, and set shared activities. 

• If there are no system requests for the Bcache, then a 21164 command is
selected.
Tables 5 and 6 provide a brief description of the commands that the 21164 and the
system can drive on the command bus.

Table 5 21164 Commands for the System

cmd<3:0>  Command              Meaning

0000      NOP                  Nothing.
0001      LOCK                 New lock register address.
0010      FETCH                21164 passes a FETCH instruction to the system.
0011      FETCH_M              21164 passes a FETCH_M instruction to the system.
0100      MEMORY BARRIER       MB instruction.
0101      SET DIRTY            Dirty bit set if shared bit is clear.
0110      WRITE BLOCK          Request to write a block.
0111      WRITE BLOCK LOCK     Request to write a block with lock.
1000      READ MISS0           Request for data.
1001      READ MISS1           Request for data.
1010      READ MISS MOD0       Request for data; modify intent.
1011      READ MISS MOD1       Request for data; modify intent.
1100      BCACHE VICTIM        Bcache victim should be removed.
1101      -                    Spare.
1110      READ MISS MOD STC0   Request for data, STx_C data.
1111      READ MISS MOD STC1   Request for data, STx_C data.
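
System logic that watches the command bus needs to map the 4-bit cmd<3:0> value back to a command. The Python sketch below shows one way to do that for the 21164-to-system direction, using the encodings listed in Table 5. The names `COMMANDS_21164_TO_SYSTEM`, `decode_21164_command`, and `is_read_miss` are hypothetical and exist only for this illustration.

```python
# Encodings as listed in Table 5; 1101 is the spare code and is left out.
COMMANDS_21164_TO_SYSTEM = {
    0b0000: "NOP",
    0b0001: "LOCK",
    0b0010: "FETCH",
    0b0011: "FETCH_M",
    0b0100: "MEMORY BARRIER",
    0b0101: "SET DIRTY",
    0b0110: "WRITE BLOCK",
    0b0111: "WRITE BLOCK LOCK",
    0b1000: "READ MISS0",
    0b1001: "READ MISS1",
    0b1010: "READ MISS MOD0",
    0b1011: "READ MISS MOD1",
    0b1100: "BCACHE VICTIM",
    0b1110: "READ MISS MOD STC0",
    0b1111: "READ MISS MOD STC1",
}


def decode_21164_command(cmd: int) -> str:
    """Return the command name for a 4-bit cmd<3:0> value driven by the 21164."""
    if not 0 <= cmd <= 0xF:
        raise ValueError("cmd<3:0> is a 4-bit field")
    return COMMANDS_21164_TO_SYSTEM.get(cmd, "SPARE")


def is_read_miss(cmd: int) -> bool:
    """True for any of the six READ MISS variants."""
    return decode_21164_command(cmd).startswith("READ MISS")


print(decode_21164_command(0b0110), is_read_miss(0b1010))
```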



Table 6 System Commands for the 21164

cmd<3:0>  Command              Meaning

0000      NOP                  Nothing.
0001      FLUSH                Remove block from caches; return dirty data (flush protocol).
0010      INVALIDATE           Remove the block (write invalidate protocol).
0011      SET SHARED           Block goes to shared state (write invalidate protocol).
0100      READ                 Read a block (flush protocol).
0101      READ DIRTY           Read a block; set shared (write invalidate protocol).
0110      READ DIRTY/INV       Read a block; invalidate (write invalidate protocol).



5.4 Interrupts 



The 21164 has seven interrupt signals that have different uses during initialization 
and normal operation. 

Figure 11 shows the 21164 interrupt signals.

Figure 11 21164 Interrupt Signals

5.4.1 Interrupt Signals During Initialization

The 21164 interrupt signals work in tandem with the sys_reset_l signal to set the 
values for many of the user-selectable clocking ratios and interface timing parame- 
ters. During initialization, the 21164 reads system clock configuration parameters 
from the interrupt pins. 

Table 7 shows the system clock divisor settings. The system clock frequency is
determined by dividing the CPU clock frequency by the ratio.

Table 7 System Clock Divisor 



irq_h<3> irq_h<2> irq_h<1> irq_h<0> Ratio 

Low Low High High 3 

Low High Low Low 4 

Low High Low High 5 

Low High High Low 6 

Low High High High 7 

High Low Low Low 8 

High Low Low High 9 

High Low High Low 10 

High Low High High 11 

High High Low Low 12 

High High Low High 13 

High High High Low 14 

High High High High 15 
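
In Table 7 each ratio equals the binary value of irq_h<3:0> when High is read as 1 and Low as 0. That observation, drawn only from the rows listed in the table, leads to the short Python sketch below. The function name `sysclk_divisor` is hypothetical, and pin combinations that the table does not list are rejected rather than guessed.

```python
def sysclk_divisor(irq3: int, irq2: int, irq1: int, irq0: int) -> int:
    """Return the system clock ratio selected by irq_h<3:0> at initialization.

    Each argument is 1 for High and 0 for Low.
    """
    value = (irq3 << 3) | (irq2 << 2) | (irq1 << 1) | irq0
    if value < 3:
        # Combinations below 3 do not appear in Table 7.
        raise ValueError("irq_h<3:0> setting not listed in Table 7")
    return value


# Low, High, High, Low selects a ratio of 6.
print(sysclk_divisor(0, 1, 1, 0))
```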
Table 8 shows how the three remaining interrupt signals are used to determine the
length of the sys_clk_out2 delay. These signals provide flexible timing for system
use.

Table 8 System Clock Delay

sys_mch_chk_irq_h  pwr_fail_irq_h  mch_hlt_irq_h  Delay Cycles

Low                Low             Low            0
Low                Low             High           1
Low                High            Low            2
Low                High            High           3
High               Low             Low            4
High               Low             High           5
High               High            Low            6
High               High            High           7
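
The delay selection in Table 8 follows a 3-bit binary pattern, with sys_mch_chk_irq_h as the most significant bit and mch_hlt_irq_h as the least significant. A minimal Python sketch of that reading follows; the function name `sysclk_out2_delay` is hypothetical.

```python
def sysclk_out2_delay(sys_mch_chk: int, pwr_fail: int, mch_hlt: int) -> int:
    """Return the sys_clk_out2 delay in CPU cycles from the three pin levels.

    Each argument is 1 for High and 0 for Low, as sampled during initialization.
    """
    for level in (sys_mch_chk, pwr_fail, mch_hlt):
        if level not in (0, 1):
            raise ValueError("pin levels must be 0 (Low) or 1 (High)")
    return (sys_mch_chk << 2) | (pwr_fail << 1) | mch_hlt


# High, Low, High selects a delay of 5 CPU cycles.
print(sysclk_out2_delay(1, 0, 1))
```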



5.4.2 Interrupt Signals During Normal Operation 

During normal operation, interrupt signals request various interrupts as described in 
Table 2. 

5.5 Test Modes 

Figure 12 shows the 21164 test signals.

Figure 12 21164 Test Signals

The 21164 test interface port consists of 13 dedicated signals. Table 9 summarizes
the 21164 test port signals and their function.

Table 9 21164 Test Port Pins

Pin Name            Type  Function

port_mode_h<1>      I     Must be false.
port_mode_h<0>      I     Must be false.
srom_present_l      I     Tied low if serial ROMs (SROMs) are present in system.
srom_data_h/Rx      I     Receives SROM or serial terminal data.
srom_clk_h/Tx       O     Supplies clock to SROMs or transmits serial terminal data.
srom_oe_l           O     SROM enable.
tdi_h               I     IEEE 1149.1 TDI port.
tdo_h               O     IEEE 1149.1 TDO port.
tms_h               I     IEEE 1149.1 TMS port.
tck_h               I     IEEE 1149.1 TCK port.
trst_l              I     IEEE 1149.1 optional TRST port.
test_status_h<0>    O     Indicates Icache BiSt status.
test_status_h<1>    O     Outputs an IPR-written value and timeout reset.

5.5.1 Normal Test Interface Mode 

The test port is in the default or normal test interface mode when the
port_mode_h<1:0> signals are tied to 00. In this mode, the test port supports the
following:

• Serial ROM interface port 

• Serial diagnostic terminal interface port 

• IEEE 1149.1 test access port 

5.5.2 Serial ROM Interface Port 

The following signals make up the serial ROM (SROM) interface: 

srom_present_l
srom_data_h
srom_oe_l
srom_clk_h

During system reset, the 21164 samples the srom_present_l signal for the presence 
of SROM. If no SROMs are detected at reset, then srom_present_l is deasserted and 
the SROM load is disabled. The reset sequence clears the Icache valid bits, which 
causes the first instruction fetch to miss the Icache and seek instructions from offchip 
memory. 

If SROMs are present during setup, then the system performs an SROM load as fol- 
lows: 

1. The srom_oe_l signal supplies the output enable to the SROM. 

2. The srom_clk_h signal supplies the clock to the ROM that causes it to advance 
to the next bit. The cycle time of this clock is 126+ times the system clock ratio. 

3. The srom_data_h signal reads the SROM data. 

5.5.3 Serial Terminal Port 

After the serial ROM data is loaded into the Icache, the three SROM load signals 
become parallel I/O pins that can drive a diagnostic terminal such as an RS422. 

5.5.4 IEEE 1149.1 Test Access Port 

The test access port complies with all requirements of the IEEE 1149.1 (JTAG) stan- 
dard. The following signals make up the test access port: 

• tms_h — Test access port select. 

• trst_l — Test access port reset. 

• tck_h — Test access port clock. 

• tdi_h and tdo_h — Input and output for serial boundary-scan, die-ID, bypass, and 
instruction registers. 

5.5.5 Test Status Signals 

The test_status_h signals extract test status information from the chip. 

• The test_status_h<0> signal indicates when the Icache built-in self-test (BiSt) 
fails. 

• The test_status_h<1> signal detects unrepairable Icache by indicating more 
than two failing Icache rows. 

6 Alpha Architecture Basics 

This section provides some basic information about the Alpha architecture. For more 
detailed information about the Alpha architecture, see the Alpha Architecture 
Reference Manual. 

6.1 The Architecture 

The Alpha architecture is a 64-bit load and store RISC architecture designed with 
particular emphasis on speed, multiple instruction issue, multiple processors, and 
software migration from many operating systems. 

All registers are 64 bits long and all operations are performed between 64-bit regis- 
ters. All instructions are 32 bits long. Memory operations are either load or store 
operations. All data manipulation is done between registers. 

The Alpha architecture supports the following data types: 

• 8-, 16-, 32-, and 64-bit integers 

• IEEE 32-bit and 64-bit floating-point formats 

• VAX architecture 32-bit and 64-bit floating-point formats 

In the Alpha architecture, instructions interact with each other only by one instruc- 
tion writing to a register or memory location and another instruction reading from 
that register or memory location. This use of resources makes it easy to build imple- 
mentations that issue multiple instructions every CPU cycle. 

The 21164 uses a set of subroutines, called privileged architecture library code 
(PALcode), that is specific to a particular Alpha operating system implementation 
and hardware platform. These subroutines provide operating system primitives for 
context switching, interrupts, exceptions, and memory management. These subrou- 
tines can be invoked by hardware or CALL_PAL instructions. CALL_PAL instruc- 
tions use the function field of the instruction to vector to a specified subroutine. 
PALcode is written in standard machine code with some implementation-specific 
extensions to provide direct access to low-level hardware functions. PALcode sup- 
ports optimizations for multiple operating systems, flexible memory-management 
implementations, and multi-instruction atomic sequences. 

The Alpha architecture performs byte shifting and masking with normal 64-bit, 
register-to-register instructions and performs single-byte load and store instructions if 
they are enabled by bit <17> of the ICSR. 

6.2 Addressing 

The basic addressable unit in the Alpha architecture is the 8-bit byte. The 21164 sup- 
ports a 43-bit virtual address. 

Virtual addresses as seen by the program are translated into physical memory 
addresses by the memory-management mechanism. The 21164 supports a 40-bit 
physical address. 
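
As a rough illustration of the two address widths, the following Python sketch checks whether a 64-bit value is a sign-extended 43-bit virtual address and whether a value fits in 40 physical address bits. The function names are hypothetical. Treating a valid virtual address as one whose upper bits repeat bit <42> is an assumption based on the sign check errors listed among the PALcode trap entry points; the hardware reference manual gives the exact rule.

```python
VA_BITS = 43            # virtual address width stated in the text
PA_BITS = 40            # physical address width stated in the text
MASK64 = (1 << 64) - 1


def sign_extend(value, bits):
    """Sign-extend the low `bits` bits of value to a 64-bit pattern."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):
        value |= MASK64 & ~((1 << bits) - 1)
    return value


def is_sign_extended_va(addr):
    """Assumption: bits <63:43> must all equal bit <42>."""
    addr &= MASK64
    return sign_extend(addr, VA_BITS) == addr


def fits_physical(addr):
    """True if addr can be expressed in 40 physical address bits."""
    return 0 <= addr < (1 << PA_BITS)
```

With these definitions, `is_sign_extended_va(1 << 42)` is false because bit <42> is set while the upper bits are clear.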

6.3 Integer Data Types 

Alpha architecture supports four integer data types: 

Data Type   Description 

Byte        A byte is 8 contiguous bits that start at an addressable byte boundary. A 
            byte is an 8-bit value. A byte is supported in Alpha architecture by the 
            EXTRACT, INSERT, LDBU, MASK, SEXTB, STB, and ZAP instructions. 

Word        A word is 2 contiguous bytes that start at an arbitrary byte boundary. A 
            word is a 16-bit value. A word is supported in Alpha architecture by the 
            EXTRACT, INSERT, LDWU, MASK, SEXTW, and STW instructions. 

Longword    A longword is 4 contiguous bytes that start at an arbitrary byte boundary. A 
            longword is a 32-bit value. A longword is supported in Alpha architecture 
            by sign-extended load and store instructions and by longword arithmetic 
            instructions. 

Quadword    A quadword is 8 contiguous bytes that start at an arbitrary byte boundary. 
            A quadword is supported in Alpha architecture by load and store 
            instructions and quadword integer operate instructions. 
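
The table above notes that longwords are handled by sign-extended loads, while the instruction summary describes LDBU and LDWU as zero-extending loads. The Python sketch below shows what those two extensions do to a value placed in a 64-bit register. The helper names are hypothetical and the code models only the arithmetic, not the instructions themselves.

```python
# Sizes in bytes of the four integer data types described in the table.
SIZES = {"byte": 1, "word": 2, "longword": 4, "quadword": 8}


def sign_extend_longword(value):
    """Return the 64-bit register pattern for a sign-extended 32-bit longword."""
    value &= 0xFFFFFFFF
    if value & 0x80000000:
        value |= 0xFFFFFFFF00000000
    return value


def zero_extend(value, size_bytes):
    """Keep the low size_bytes bytes and clear everything above them."""
    return value & ((1 << (8 * size_bytes)) - 1)
```

For example, the longword 80000000 (hexadecimal) becomes FFFFFFFF80000000 in a register, whereas a zero-extended byte never sets any bit above bit 7.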

Note: Alpha implementations may impose a significant performance penalty 
when accessing operands that are not NATURALLY ALIGNED. Refer 
to the Alpha Architecture Reference Manual for details. 

6.4 Floating-Point Data Types 

The 21164 supports the following floating-point data types: 

• Longword integer format in floating-point unit 

• Quadword integer format in floating-point unit 

• IEEE floating-point formats 

- S_floating 

- T_floating 

• VAX floating-point formats 

- F_floating 

- G_floating 

- D_floating (limited support) 

7 IEEE Floating-Point Conformance 

The 21164 supports the IEEE floating-point operations as defined by the Alpha 
architecture. Support for a complete implementation of the IEEE Standard for 
Binary Floating-Point Arithmetic (ANSI/IEEE Standard 754-1985) is provided by a 
combination of hardware and software as described in the Alpha Architecture 
Reference Manual. 

Additional information about writing code to support precise exception handling 
(necessary for complete conformance to the standard) is in the Alpha Architecture 
Reference Manual. 

The following information is specific to the 21164: 

• Invalid operation (INV) 

The invalid operation trap is always enabled. If the trap occurs, then the destina- 
tion register is UNPREDICTABLE. This exception is signaled if any VAX 
architecture operand is nonfinite (reserved operand or dirty zero) and the opera- 
tion can take an exception (that is, certain instructions, such as CPYS, never take 
an exception). This exception is signaled if any IEEE operand is nonfinite 
(NAN, INF, denorm) and the operation can take an exception. This trap is also 
signaled for an IEEE format divide of ±0 divided by ±0. If the exception occurs, 
then FPCR<INV> is set and the trap is signaled to the IDU. 

• Divide-by-zero (DZE) 

The divide-by-zero trap is always enabled. If the trap occurs, then the destination 
register is UNPREDICTABLE. For VAX architecture format, this exception is 
signaled whenever the numerator is valid and the denominator is zero. For IEEE 
format, this exception is signaled whenever the numerator is valid and nonzero, 
with a denominator of ±0. If the exception occurs, then FPCR<DZE> is set and 
the trap is signaled to the IDU. 

For IEEE format divides, 0/0 signals INV, not DZE. 

• Floating overflow (OVF) 

The floating overflow trap is always enabled. If the trap occurs, then the destina- 
tion register is UNPREDICTABLE. The exception is signaled if the rounded 
result exceeds in magnitude the largest finite number, which can be represented 
by the destination format. This applies only to operations whose destination is a 
floating-point data type. If the exception occurs, then FPCR<OVF> is set and the 
trap is signaled to the IDU. 

• Underflow (UNF) 

The underflow trap can be disabled. If underflow occurs, then the destination 
register is forced to a true zero, consisting of a full 64 bits of zero. This is done 
even if the proper IEEE result would have been -0. The exception is signaled if 
the rounded result is smaller in magnitude than the smallest finite number that 
can be represented by the destination format. If the exception occurs, then 
FPCR<UNF> is set. If the trap is enabled, then the trap is signaled to the IDU. 
The 21164 never produces a denormal number; underflow occurs instead. 

• Inexact (INE) 

The inexact trap can be disabled. The destination register always contains the 
properly rounded result, whether or not the trap is enabled. The exception is signaled if 
the rounded result is different from what would have been produced if infinite 
precision (infinitely wide data) were available. For floating-point results, this 
requires both an infinite precision exponent and fraction. For integer results, this 
requires an infinite precision integer and an integral result. If the exception 
occurs, then FPCR<INE> is set. If the trap is enabled, then the trap is signaled to 
the IDU. 

The IEEE-754 specification allows INE to occur concurrently with either OVF 
or UNF. Whenever OVF is signaled (if the inexact trap is enabled), INE is also 
signaled. Whenever UNF is signaled (if the inexact trap is enabled), INE is also 
signaled. The inexact trap also occurs concurrently with integer overflow. All 
valid opcodes that enable INE also enable both overflow and underflow. 

If a CVTQL results in an integer overflow (IOV), then FPCR<INE> is automatically 
set. (The INE trap is never signaled to the IDU because there is no CVTQL 
opcode that enables the inexact trap.) 

• Integer overflow (IOV) 

The integer overflow trap can be disabled. The destination register always contains 
the low-order bits (<64> or <32>) of the true result (not the truncated bits). 
Integer overflow can occur with CVTTQ, CVTGQ, or CVTQL. In conversions 
from floating to quadword integer or longword integer, an integer overflow 
occurs if the rounded result is outside the range $-2^{63}$ to $2^{63}-1$. In conversions 
from quadword integer to longword integer, an integer overflow occurs if the result 
is outside the range $-2^{31}$ to $2^{31}-1$. If the exception occurs, then the appropriate 
bit in the FPCR is set. If the trap is enabled, then the trap is signaled to the IDU. 

• Software completion (SWC) 

The software completion signal is not recorded in the FPCR. The state of this 
signal is always sent to the IDU. If the IDU detects the assertion of any of the 
listed exceptions concurrent with the assertion of the SWC signal, then it sets 
EXC_SUM<SWC>. 
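
The rules above for IEEE-format divides can be summarized in a small Python sketch. It is a simplification under stated assumptions: Python floats stand in for IEEE operands, the function name is hypothetical, and only the INV and DZE input conditions described in this section are modeled.

```python
import math
import sys


def is_nonfinite(x):
    """NaN, infinity, or denormal, the operand classes the text calls nonfinite."""
    if math.isnan(x) or math.isinf(x):
        return True
    return x != 0.0 and abs(x) < sys.float_info.min


def ieee_divide_exception(numerator, denominator):
    """Return 'INV', 'DZE', or None for an IEEE-format divide."""
    if is_nonfinite(numerator) or is_nonfinite(denominator):
        return "INV"                      # nonfinite operand
    if denominator == 0.0:                # matches both +0 and -0
        return "INV" if numerator == 0.0 else "DZE"
    return None
```

A divide of 0 by 0 is classified as INV, while a valid nonzero numerator over a zero denominator is classified as DZE.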

Input exceptions always take priority over output exceptions. If both exception types 
occur, then only the input exception is recorded in the FPCR and only the input 
exception is signaled to the IDU. 

8 Internal Processor Registers 

The tables in this section provide a summary of the 21164 implementation-specific 
internal processor registers (IPRs). For detailed register information, see the 
DIGITAL Alpha 21164 Microprocessor Hardware Reference Manual. For more 
information about the architecturally specified IPRs, see the Alpha Architecture 
Reference Manual. 

8.1 IDU, MTU, Dcache, and PALtemp IPRs 

Table 10 lists the IDU, MTU, data cache (Dcache), and PALtemp IPRs. These IPRs 
are accessible to PALcode by means of the HW_MTPR and HW_MFPR instruc- 
tions, using the IPR index. The IDU holds a bank of 24 PALtemp registers. 



Table 10 IDU, MTU, Dcache, and PALtemp IPRs 

IPR Mnemonic      Register Name                                              Access  Index₁₆ 

IDU IPRs 

ISR               Interrupt summary                                          R       100 
ITB_TAG           Istream translation buffer tag                             W       101 
ITB_PTE           Instruction translation buffer page table entry            R/W     102 
ITB_ASN           Instruction translation buffer address space number        R/W     103 
ITB_PTE_TEMP      Instruction translation buffer page table entry temporary  R       104 
ITB_IA            Instruction translation buffer invalidate all              W       105 
ITB_IAP           Instruction translation buffer invalidate all process      W       106 
ITB_IS            Instruction translation buffer invalidate single           W       107 
SIRR              Software interrupt request                                 R/W     108 
ASTRR             Asynchronous system trap request                           R/W     109 
ASTER             Asynchronous system trap enable                            R/W     10A 
EXC_ADDR          Exception address                                          R/W     10B 
EXC_SUM           Exception summary                                          R/W0C   10C 
EXC_MASK          Exception mask                                             R       10D 
PAL_BASE          Privileged architecture library base address               R/W     10E 
ICM               IDU current mode                                           R/W     10F 
IPLR              Interrupt priority level                                   R/W     110 
INTID             Interrupt ID                                               R       111 
IFAULT_VA_FORM    Formatted faulting virtual address                         R       112 
IVPTBR            Virtual page table base                                    R/W     113 
HWINT_CLR         Hardware interrupt clear                                   W       115 
SL_XMIT           Serial line transmit                                       W       116 
SL_RCV            Serial line receive                                        R       117 
ICSR              IDU control and status                                     R/W     118 
IC_FLUSH_CTL      Icache flush control                                       W       119 
ICPERR_STAT       Icache parity error status                                 R/W1C   11A 
PMCTR             Performance counter                                        R/W     11C 

PALtemp IPRs 

PALtemp0          -                                                          R/W     140 
PALtemp1          -                                                          R/W     141 
PALtemp2          -                                                          R/W     142 
PALtemp3          -                                                          R/W     143 
PALtemp4          -                                                          R/W     144 
PALtemp5          -                                                          R/W     145 
PALtemp6          -                                                          R/W     146 
PALtemp7          -                                                          R/W     147 
PALtemp8          -                                                          R/W     148 
PALtemp9          -                                                          R/W     149 
PALtemp10         -                                                          R/W     14A 
PALtemp11         -                                                          R/W     14B 
PALtemp12         -                                                          R/W     14C 
PALtemp13         -                                                          R/W     14D 
PALtemp14         -                                                          R/W     14E 
PALtemp15         -                                                          R/W     14F 
PALtemp16         -                                                          R/W     150 
PALtemp17         -                                                          R/W     151 
PALtemp18         -                                                          R/W     152 
PALtemp19         -                                                          R/W     153 
PALtemp20         -                                                          R/W     154 
PALtemp21         -                                                          R/W     155 
PALtemp22         -                                                          R/W     156 
PALtemp23         -                                                          R/W     157 

MTU IPRs 

DTB_ASN           Dstream translation buffer address space number            W       200 
DTB_CM            Dstream translation buffer current mode                    W       201 
DTB_TAG           Dstream translation buffer tag                             W       202 
DTB_PTE           Dstream translation buffer page table entry                R/W     203 
DTB_PTE_TEMP      Dstream translation buffer page table entry temporary      R       204 
MM_STAT           Dstream memory-management fault status                     R       205 
VA                Faulting virtual address                                   R       206 
VA_FORM           Formatted virtual address                                  R       207 
MVPTBR            MTU virtual page table base                                W       208 
DTB_IAP           Dstream translation buffer invalidate all process          W       209 
DTB_IA            Dstream translation buffer invalidate all                  W       20A 
DTB_IS            Dstream translation buffer invalidate single               W       20B 
ALT_MODE          Alternate mode                                             W       20C 
CC                Cycle counter                                              W       20D 
CC_CTL            Cycle counter control                                      W       20E 
MCSR              MTU control                                                R/W     20F 
DC_FLUSH          Dcache flush                                               W       210 
DC_PERR_STAT      Dcache parity error status                                 R/W1C   212 
DC_TEST_CTL       Dcache test tag control                                    R/W     213 
DC_TEST_TAG       Dcache test tag                                            R/W     214 
DC_TEST_TAG_TEMP  Dcache test tag temporary                                  R/W     215 
DC_MODE           Dcache mode                                                R/W     216 
MAF_MODE          Miss address file mode                                     R/W     217 
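
The 24 PALtemp registers in Table 10 occupy consecutive indexes starting at 140 (hexadecimal), so the index of any PALtemp register can be computed rather than looked up. The Python sketch below illustrates this; the function name is hypothetical and the code only mirrors the table, it does not model the HW_MFPR or HW_MTPR instructions.

```python
PALTEMP_BASE = 0x140   # index of PALtemp0 in Table 10
PALTEMP_COUNT = 24     # the IDU holds a bank of 24 PALtemp registers


def paltemp_index(n):
    """Return the IPR index of PALtemp<n>, for n in 0..23."""
    if not 0 <= n < PALTEMP_COUNT:
        raise ValueError("PALtemp number must be in the range 0..23")
    return PALTEMP_BASE + n
```

For example, `paltemp_index(10)` gives 14A (hexadecimal), matching the PALtemp10 row of the table.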



8.2 External Interface Control (CBU) IPRs 

Table 11 summarizes IPRs for controlling Scache, Bcache, system configuration, and 
logging error information. These IPRs cannot be read or written from the system. 
They are placed in the 1MB region of 21164-specific I/O address space ranging from 
FF FFF0 0000 to FF FFFF FFFF. Any read or write operation to an undefined IPR in 
this address space produces UNDEFINED behavior. The operating system should 
not map any address in this region as writable in any mode. 

Table 11 CBU Internal Processor Register Descriptions 

Register      Description                 Type¹  Address 

SC_CTL        Scache control              RW     FF FFF0 00A8 
SC_STAT       Scache status               R      FF FFF0 00E8 
SC_ADDR       Scache address              R      FF FFF0 0188 
BC_CONTROL    Bcache control              W      FF FFF0 0128 
BC_CONFIG     Bcache configuration        W      FF FFF0 01C8 
BC_TAG_ADDR   Bcache tag address          R      FF FFF0 0108 
EI_STAT       External interface status   R      FF FFF0 0168 
EI_ADDR       External interface address  R      FF FFF0 0148 
FILL_SYN      Fill syndrome               R      FF FFF0 0068 

¹ BC_CONTROL<01> must be 0 when reading any IPR in this table. 
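
As a quick consistency check on the address range and the register addresses in Table 11, the Python sketch below confirms that the region is exactly 1MB and maps an address back to a register name. The dictionary and function names are hypothetical; the addresses are copied from the table.

```python
CBU_REGION_START = 0xFF_FFF0_0000
CBU_REGION_END = 0xFF_FFFF_FFFF    # inclusive upper bound

# Register addresses as listed in Table 11.
CBU_IPRS = {
    "SC_CTL": 0xFF_FFF0_00A8,
    "SC_STAT": 0xFF_FFF0_00E8,
    "SC_ADDR": 0xFF_FFF0_0188,
    "BC_CONTROL": 0xFF_FFF0_0128,
    "BC_CONFIG": 0xFF_FFF0_01C8,
    "BC_TAG_ADDR": 0xFF_FFF0_0108,
    "EI_STAT": 0xFF_FFF0_0168,
    "EI_ADDR": 0xFF_FFF0_0148,
    "FILL_SYN": 0xFF_FFF0_0068,
}


def in_cbu_region(addr):
    """True if addr falls in the 21164-specific I/O region."""
    return CBU_REGION_START <= addr <= CBU_REGION_END


def cbu_ipr_name(addr):
    """Return the register at addr, or None if no IPR is defined there."""
    for name, ipr_addr in CBU_IPRS.items():
        if ipr_addr == addr:
            return name
    return None
```

A lookup that returns `None` for an address inside the region corresponds to an undefined IPR, which the text says must not be accessed.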

8.3 PALcode Storage Registers 

The 21164 IEU register file has eight extra registers that are called the PALshadow 
registers. The PALshadow registers overlay R8 through R14 and R25 when the CPU 
is in PALmode and ICSR<SDE> is set. Thus, PALcode can consider R8 through 
R14 and R25 as local scratch. PALshadow registers cannot be written in the last two 
cycles of a PALcode flow. The normal state of the CPU is ICSR<SDE> = ON. 
PALcode disables SDE for the unaligned trap and for error flows. 

9 PALcode 

Privileged architecture library code (PALcode) is macrocode that provides an archi- 
tecturally defined operating-system-specific programming interface that is common 
across all Alpha microprocessors. The actual implementation of PALcode differs for 
each operating system. 

PALcode runs with privileges enabled, instruction stream (Istream) mapping dis- 
abled, and interrupts disabled. PALcode has privilege to use five special opcodes 
that allow functions such as physical data stream (Dstream) references and internal 
processor register (IPR) manipulation. 

PALcode can be invoked by the following events: 

• Reset 

• System hardware exceptions (MCHK, ARITH) 

• Memory-management exceptions 

• Interrupts 

• CALL_PAL instructions 

9.1 PALcode Entry Points 

PALcode is invoked at specific entry points. The 21164 has two types of PALcode 
entry points: 

• CALL_PAL entry points are used whenever the IDU encounters a CALL_PAL 
instruction in the Istream. 

- Privileged CALL_PAL instructions start at offset 2000₁₆. 

- Unprivileged CALL_PAL instructions start at offset 3000₁₆. 

• Chip-specific trap entry points start PALcode. 

9.1.1 PALcode Trap Entry Points 

Table 12 shows the PALcode trap entry points and their offset from the PAL_BASE 
IPR. Entry points are listed from highest to lowest priority. 
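
Because each entry point is an offset from the PAL_BASE IPR, the address at which PALcode starts for a given trap is a simple sum. The Python sketch below illustrates this with the offsets listed in Table 12; the function name and the PAL_BASE value used in the example are hypothetical.

```python
# Offsets (hexadecimal) of the trap entry points listed in Table 12.
TRAP_OFFSETS = {
    "RESET": 0x0000,
    "IACCVIO": 0x0080,
    "INTERRUPT": 0x0100,
    "ITBMISS": 0x0180,
    "DTBMISS_SINGLE": 0x0200,
    "DTBMISS_DOUBLE": 0x0280,
    "UNALIGN": 0x0300,
    "DFAULT": 0x0380,
    "MCHK": 0x0400,
    "OPCDEC": 0x0480,
    "ARITH": 0x0500,
    "FEN": 0x0580,
}


def trap_entry_address(pal_base, name):
    """Return pal_base plus the Table 12 offset for the named trap."""
    return pal_base + TRAP_OFFSETS[name]
```

With an illustrative PAL_BASE of 4000 (hexadecimal), the MCHK entry point would be at 4400. The offsets are spaced 80 (hexadecimal) bytes apart.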

Table 12 PALcode Trap Entry Points 

Entry Name      Offset₁₆  Description 

RESET           0000      Reset 
IACCVIO         0080      Istream access violation or sign check error on PC 
INTERRUPT       0100      Interrupt: hardware, software, and AST 
ITBMISS         0180      Istream TBMISS 
DTBMISS_SINGLE  0200      Dstream TBMISS 
DTBMISS_DOUBLE  0280      Dstream TBMISS during virtual page table entry 
                          (PTE) fetch 
UNALIGN         0300      Dstream unaligned reference 
DFAULT          0380      Dstream fault or sign check error on virtual address 
MCHK            0400      Uncorrected hardware error 
OPCDEC          0480      Illegal opcode 
ARITH           0500      Arithmetic exception 
FEN             0580      Floating-point operation attempted with: 
                          • Floating-point instructions (LD, ST, and 
                            operates) disabled through FPE bit in the 
                            ICSR IPR 
                          • Floating-point IEEE operation with data type 
                            other than S, T, or Q 


9.2 Required PALcode Function Codes 

Table 13 lists opcodes required for all Alpha implementations. The notation used is 
oo.ffff, where oo is the hexadecimal 6-bit opcode and ffff is the hexadecimal 26-bit 
function code. 

Table 13 Required PALcode Function Codes 

Mnemonic  Type          Function Code 

DRAINA    Privileged    00.0002 
HALT      Privileged    00.0000 
IMB       Unprivileged  00.0086 
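
To make the oo.ffff notation concrete, the Python sketch below parses an entry such as 00.0086 and packs it into a 32-bit instruction word, with the 6-bit opcode in the high-order bits and the 26-bit function code in the low-order bits. The function names are hypothetical, and placing the opcode in bits <31:26> is taken from the Alpha instruction formats rather than from this data sheet.

```python
def parse_notation(text):
    """Split 'oo.ffff' into (opcode, function) integers, both read as hex."""
    opcode_text, function_text = text.split(".")
    return int(opcode_text, 16), int(function_text, 16)


def encode_call_pal(function, opcode=0x00):
    """Pack a PALcode-format instruction word (assumed field layout)."""
    if not 0 <= opcode < (1 << 6):
        raise ValueError("opcode must fit in 6 bits")
    if not 0 <= function < (1 << 26):
        raise ValueError("function code must fit in 26 bits")
    return (opcode << 26) | function
```

Under these assumptions, the IMB entry 00.0086 encodes to the word 00000086 (hexadecimal).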



9.3 Opcodes Reserved for PALcode 

Table 14 lists the opcodes reserved by the Alpha architecture for implementation- 
specific use. These opcodes are privileged and are only available in PALmode. 
Section 10.1.2 shows the opcodes reserved for PALcode. 

Table 14 Opcodes Reserved for PALcode 

Opcode  Architecture Mnemonic 

1F      PAL1F 
1E      PAL1E 
19      PAL19 
1D      PAL1D 

10 Alpha Instruction Summary 

This section contains a summary of all Alpha architecture instructions. All values are 
in hexadecimal radix. Table 15 describes the contents of the Format and Opcode 
columns that are in Table 16. 

Table 15 Instruction Format and Opcode Notation 

Instruction Format      Format Symbol  Opcode Notation  Meaning 

Branch                  Bra            oo               oo is the 6-bit opcode field. 
Floating-point          F-P            oo.fff           oo is the 6-bit opcode field. 
                                                        fff is the 11-bit function code field. 
Memory                  Mem            oo               oo is the 6-bit opcode field. 
Memory/function code    Mfc            oo.ffff          oo is the 6-bit opcode field. 
                                                        ffff is the 16-bit function code in the 
                                                        displacement field. 
Memory/branch           Mbr            oo.h             oo is the 6-bit opcode field. 
                                                        h is the high-order 2 bits of the 
                                                        displacement field. 
Operate                 Opr            oo.ff            oo is the 6-bit opcode field. 
                                                        ff is the 7-bit function code field. 
PALcode                 Pcd            oo               oo is the 6-bit opcode field; the particular 
                                                        PALcode instruction is specified in the 26-bit 
                                                        function code field. 

Qualifiers for operate instructions are shown in Table 16. Qualifiers for IEEE and 
VAX floating-point instructions are shown in Tables 19 and 20, respectively. 



Table 16 Architecture Instructions 

Mnemonic       Format  Opcode   Description 

ADDF           F-P     15.080   Add F_floating 
ADDG           F-P     15.0A0   Add G_floating 
ADDL           Opr     10.00    Add longword 
ADDL/V         Opr     10.40    Add longword 
ADDQ           Opr     10.20    Add quadword 
ADDQ/V         Opr     10.60    Add quadword 
ADDS           F-P     16.080   Add S_floating 
ADDT           F-P     16.0A0   Add T_floating 
AMASK          Opr     11.61    Determine byte/word instruction implementation 
AND            Opr     11.00    Logical product 
BEQ            Bra     39       Branch if = zero 
BGE            Bra     3E       Branch if ≥ zero 
BGT            Bra     3F       Branch if > zero 
BIC            Opr     11.08    Bit clear 
BIS            Opr     11.20    Logical sum 
BLBC           Bra     38       Branch if low bit clear 
BLBS           Bra     3C       Branch if low bit set 
BLE            Bra     3B       Branch if ≤ zero 
BLT            Bra     3A       Branch if < zero 
BNE            Bra     3D       Branch if ≠ zero 
BR             Bra     30       Unconditional branch 
BSR            Mbr     34       Branch to subroutine 
CALL_PAL       Pcd     00       Trap to PALcode 
CMOVEQ         Opr     11.24    CMOVE if = zero 
CMOVGE         Opr     11.46    CMOVE if ≥ zero 
CMOVGT         Opr     11.66    CMOVE if > zero 
CMOVLBC        Opr     11.16    CMOVE if low bit clear 
CMOVLBS        Opr     11.14    CMOVE if low bit set 
CMOVLE         Opr     11.64    CMOVE if ≤ zero 
CMOVLT         Opr     11.44    CMOVE if < zero 
CMOVNE         Opr     11.26    CMOVE if ≠ zero 
CMPBGE         Opr     10.0F    Compare byte 
CMPEQ          Opr     10.2D    Compare signed quadword equal 
CMPGEQ         F-P     15.0A5   Compare G_floating equal 
CMPGLE         F-P     15.0A7   Compare G_floating less than or equal 
CMPGLT         F-P     15.0A6   Compare G_floating less than 
CMPLE          Opr     10.6D    Compare signed quadword less than or equal 
CMPLT          Opr     10.4D    Compare signed quadword less than 
CMPTEQ         F-P     16.0A5   Compare T_floating equal 
CMPTLE         F-P     16.0A7   Compare T_floating less than or equal 
CMPTLT         F-P     16.0A6   Compare T_floating less than 
CMPTUN         F-P     16.0A4   Compare T_floating unordered 
CMPULE         Opr     10.3D    Compare unsigned quadword less than or equal 
CMPULT         Opr     10.1D    Compare unsigned quadword less than 
CPYS           F-P     17.020   Copy sign 
CPYSE          F-P     17.022   Copy sign and exponent 
CPYSN          F-P     17.021   Copy sign negate 
CVTDG          F-P     15.09E   Convert D_floating to G_floating 
CVTGD          F-P     15.0AD   Convert G_floating to D_floating 
CVTGF          F-P     15.0AC   Convert G_floating to F_floating 
CVTGQ          F-P     15.0AF   Convert G_floating to quadword 
CVTLQ          F-P     17.010   Convert longword to quadword 
CVTQF          F-P     15.0BC   Convert quadword to F_floating 
CVTQG          F-P     15.0BE   Convert quadword to G_floating 
CVTQL          F-P     17.030   Convert quadword to longword 
CVTQL/SV       F-P     17.530   Convert quadword to longword 
CVTQL/V        F-P     17.130   Convert quadword to longword 
CVTQS          F-P     16.0BC   Convert quadword to S_floating 
CVTQT          F-P     16.0BE   Convert quadword to T_floating 
CVTST          F-P     16.2AC   Convert S_floating to T_floating 
CVTTQ          F-P     16.0AF   Convert T_floating to quadword 
CVTTS          F-P     16.0AC   Convert T_floating to S_floating 
DIVF           F-P     15.083   Divide F_floating 
DIVG           F-P     15.0A3   Divide G_floating 
DIVS           F-P     16.083   Divide S_floating 
DIVT           F-P     16.0A3   Divide T_floating 
EQV            Opr     11.48    Logical equivalence 
EXCB           Mfc     18.0400  Exception barrier 
EXTBL          Opr     12.06    Extract byte low 
EXTLH          Opr     12.6A    Extract longword high 
EXTLL          Opr     12.26    Extract longword low 
EXTQH          Opr     12.7A    Extract quadword high 
EXTQL          Opr     12.36    Extract quadword low 
EXTWH          Opr     12.5A    Extract word high 
EXTWL          Opr     12.16    Extract word low 
FBEQ           Bra     31       Floating branch if = zero 
FBGE           Bra     36       Floating branch if ≥ zero 
FBGT           Bra     37       Floating branch if > zero 
FBLE           Bra     33       Floating branch if ≤ zero 
FBLT           Bra     32       Floating branch if < zero 
FBNE           Bra     35       Floating branch if ≠ zero 
FCMOVEQ        F-P     17.02A   FCMOVE if = zero 
FCMOVGE        F-P     17.02D   FCMOVE if ≥ zero 
FCMOVGT        F-P     17.02F   FCMOVE if > zero 
FCMOVLE        F-P     17.02E   FCMOVE if ≤ zero 
FCMOVLT        F-P     17.02C   FCMOVE if < zero 
FCMOVNE        F-P     17.02B   FCMOVE if ≠ zero 
FETCH          Mfc     18.80    Prefetch data 
FETCH_M        Mfc     18.A0    Prefetch data, modify intent 
IMPLVER        Opr     11.6C    Determine CPU type 
INSBL          Opr     12.0B    Insert byte low 
INSLH          Opr     12.67    Insert longword high 
INSLL          Opr     12.2B    Insert longword low 
INSQH          Opr     12.77    Insert quadword high 
INSQL          Opr     12.3B    Insert quadword low 
INSWH          Opr     12.57    Insert word high 
INSWL          Opr     12.1B    Insert word low 
JMP            Mbr     1A.0     Jump 
JSR            Mbr     1A.1     Jump to subroutine 
JSR_COROUTINE  Mbr     1A.3     Jump to subroutine return 
LDA            Mem     08       Load address 
LDAH           Mem     09       Load address high 
LDBU           Mem     0A       Load zero-extended byte 
LDF            Mem     20       Load F_floating 
LDG            Mem     21       Load G_floating 
LDL            Mem     28       Load sign-extended longword 
LDL_L          Mem     2A       Load sign-extended longword locked 
LDQ            Mem     29       Load quadword 
LDQ_L          Mem     2B       Load quadword locked 
LDQ_U          Mem     0B       Load unaligned quadword 
LDS            Mem     22       Load S_floating 
LDT            Mem     23       Load T_floating 
LDWU           Mem     0C       Load zero-extended word 
MB             Mfc     18.4000  Memory barrier 
MF_FPCR        F-P     17.025   Move from floating-point control register 
MSKBL          Opr     12.02    Mask byte low 
MSKLH          Opr     12.62    Mask longword high 
MSKLL          Opr     12.22    Mask longword low 
MSKQH          Opr     12.72    Mask quadword high 
MSKQL          Opr     12.32    Mask quadword low 
MSKWH          Opr     12.52    Mask word high 
MSKWL          Opr     12.12    Mask word low 
MT_FPCR        F-P     17.024   Move to floating-point control register 
MULF           F-P     15.082   Multiply F_floating 
MULG           F-P     15.0A2   Multiply G_floating 
MULL           Opr     13.00    Multiply longword 
MULL/V         Opr     13.40    Multiply longword 
MULQ           Opr     13.20    Multiply quadword 
MULQ/V         Opr     13.60    Multiply quadword 
MULS           F-P     16.082   Multiply S_floating 
MULT           F-P     16.0A2   Multiply T_floating 
ORNOT          Opr     11.28    Logical sum with complement 
RC             Mfc     18.E0    Read and clear 
RET            Mbr     1A.2     Return from subroutine 
RPCC           Mfc     18.C0    Read process cycle counter 
RS             Mfc     18.F000  Read and set 
S4ADDL         Opr     10.02    Scaled add longword by 4 
S4ADDQ         Opr     10.22    Scaled add quadword by 4 
S4SUBL         Opr     10.0B    Scaled subtract longword by 4 
S4SUBQ         Opr     10.2B    Scaled subtract quadword by 4 
S8ADDL         Opr     10.12    Scaled add longword by 8 
S8ADDQ         Opr     10.32    Scaled add quadword by 8 
S8SUBL         Opr     10.1B    Scaled subtract longword by 8 
S8SUBQ         Opr     10.3B    Scaled subtract quadword by 8 
SEXTB          Opr     1C.00    Sign extend byte 
SEXTW          Opr     1C.01    Sign extend word 
SLL            Opr     12.39    Shift left logical 
SRA            Opr     12.3C    Shift right arithmetic 
SRL            Opr     12.34    Shift right logical 
STB            Mem     0E       Store byte 
STF            Mem     24       Store F_floating 
STG            Mem     25       Store G_floating 
STL            Mem     2C       Store longword 
STL_C          Mem     2E       Store longword conditional 
STQ            Mem     2D       Store quadword 
STQ_C          Mem     2F       Store quadword conditional 
STQ_U          Mem     0F       Store unaligned quadword 
STS            Mem     26       Store S_floating 
STT            Mem     27       Store T_floating 
STW            Mem     0D       Store word 
SUBF           F-P     15.081   Subtract F_floating 
SUBG           F-P     15.0A1   Subtract G_floating 
SUBL           Opr     10.09    Subtract longword 
SUBL/V         Opr     10.49    Subtract longword 
SUBQ           Opr     10.29    Subtract quadword 
SUBQ/V         Opr     10.69    Subtract quadword 
SUBS           F-P     16.081   Subtract S_floating 
SUBT           F-P     16.0A1   Subtract T_floating 
TRAPB          Mfc     18.00    Trap barrier 
UMULH          Opr     13.30    Unsigned multiply quadword high 
WMB            Mfc     18.44    Write memory barrier 
XOR            Opr     11.40    Logical difference 
ZAP            Opr     12.30    Zero bytes 
ZAPNOT         Opr     12.31    Zero bytes not 
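
The opcode notation used in Table 16 can be recovered from a 32-bit instruction word by extracting the opcode and function fields. The Python sketch below shows one way to do this. The function names are hypothetical, and the bit positions (opcode in <31:26>, Operate function in <11:5>, floating-point function in <15:5>, Memory function in <15:0>) are taken from the Alpha instruction formats rather than from this data sheet, so they should be checked against the Alpha Architecture Reference Manual.

```python
def encode_operate(opcode, function, ra, rb, rc):
    """Pack a register-form Operate instruction (assumed field layout)."""
    return (opcode << 26) | (ra << 21) | (rb << 16) | (function << 5) | rc


def decode_opcode(word):
    return (word >> 26) & 0x3F          # 6-bit opcode field


def notation(word, fmt):
    """Render a word in Table 16 notation for format 'Opr', 'F-P', or 'Mfc'."""
    opcode = decode_opcode(word)
    if fmt == "Opr":
        return "%02X.%02X" % (opcode, (word >> 5) & 0x7F)    # 7-bit function
    if fmt == "F-P":
        return "%02X.%03X" % (opcode, (word >> 5) & 0x7FF)   # 11-bit function
    if fmt == "Mfc":
        return "%02X.%04X" % (opcode, word & 0xFFFF)         # 16-bit function
    return "%02X" % opcode                                   # Bra, Mem, Mbr, Pcd
```

For instance, a word built with `encode_operate(0x10, 0x20, 1, 2, 3)` renders as 10.20, the ADDQ entry in the table.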



10.1 Reserved Opcodes 



This section describes the opcodes that are reserved in the Alpha architecture. They 
can be reserved for DIGITAL or for PALcode. 

10.1.1 Opcodes Reserved for DIGITAL 

Table 17 lists opcodes reserved for DIGITAL. 

Table 17 Opcodes Reserved for DIGITAL 

Mnemonic  Opcode    Mnemonic  Opcode    Mnemonic  Opcode 

OPC01     01        OPC05     05        OPC0B     0B 
OPC02     02        OPC06     06        OPC0C     0C¹ 
OPC03     03        OPC07     07        OPC0D     0D¹ 
OPC04     04        OPC0A     0A        OPC0E     0E¹ 

¹ Reserved when byte/word instructions are not enabled. 

10.1.2 Opcodes Reserved for PALcode 

Table 18 lists the 21164-specific instructions. For more information, refer to the 
DIGITAL Alpha 21164 Microprocessor Hardware Reference Manual. 



Table 18 Opcodes Reserved for PALcode 

21164               Architecture 
Mnemonic   Opcode   Mnemonic      Function 

HW_LD      1B       PAL1B         Performs Dstream load instructions. 
HW_ST      1F       PAL1F         Performs Dstream store instructions. 
HW_REI     1E       PAL1E         Returns instruction flow to the program counter 
                                  (PC) pointed to by EXC_ADDR internal processor 
                                  register (IPR). 
HW_MFPR    19       PAL19         Accesses the IDU, MTU, and Dcache IPRs. 
HW_MTPR    1D       PAL1D         Accesses the IDU, MTU, and Dcache IPRs. 

10.2 IEEE Floating-Point Instructions 

Table 19 lists the hexadecimal value of the 11-bit function code field for the IEEE 
floating-point instructions, with and without qualifiers. The opcode for these 
instructions is 16₁₆. 



Table 19 IEEE Floating-Point Instruction Function Codes 

Mnemonic  None  /C    /M    /D    /U    /UC   /UM   /UD 

ADDS      080   000   040   0C0   180   100   140   1C0 
ADDT      0A0   020   060   0E0   1A0   120   160   1E0 
CMPTEQ    0A5 
CMPTLE    0A7 
CMPTLT    0A6 
CMPTUN    0A4 
CVTQS     0BC   03C   07C   0FC 
CVTQT     0BE   03E   07E   0FE 
CVTTS     0AC   02C   06C   0EC   1AC   12C   16C   1EC 
DIVS      083   003   043   0C3   183   103   143   1C3 
DIVT      0A3   023   063   0E3   1A3   123   163   1E3 
MULS      082   002   042   0C2   182   102   142   1C2 
MULT      0A2   022   062   0E2   1A2   122   162   1E2 
SUBS      081   001   041   0C1   181   101   141   1C1 
SUBT      0A1   021   061   0E1   1A1   121   161   1E1 

Mnemonic  /SU   /SUC  /SUM  /SUD  /SUI  /SUIC /SUIM /SUID 

ADDS      580   500   540   5C0   780   700   740   7C0 
ADDT      5A0   520   560   5E0   7A0   720   760   7E0 
CMPTEQ    5A5 
CMPTLE    5A7 
CMPTLT    5A6 
CMPTUN    5A4 
CVTQS                             7BC   73C   77C   7FC 
CVTQT                             7BE   73E   77E   7FE 
CVTTS     5AC   52C   56C   5EC   7AC   72C   76C   7EC 
DIVS      583   503   543   5C3   783   703   743   7C3 
DIVT      5A3   523   563   5E3   7A3   723   763   7E3 
MULS      582   502   542   5C2   782   702   742   7C2 
MULT      5A2   522   562   5E2   7A2   722   762   7E2 
SUBS      581   501   541   5C1   781   701   741   7C1 
SUBT      5A1   521   561   5E1   7A1   721   761   7E1 

Mnemonic  None  /S 

CVTST     2AC   6AC 














Mnemonic 


None 


/C 


/V 


/VC 


/SV 


/SVC 


/SVI 


/SVIC 


CVTTQ 


OAF 


02F 


lAF 


12F 


5AF 


52F 


7AF 


72F 


Mnemonic 


D 


/VD 


/SVD 


/SVID 


/M 


/VM 


/SVM 


/SVIM 



CVTTQ 



OFF 



IFF 



5FF 



7FF 



06F 



16F 



56F 



76F 



Note: Because underflow cannot occur for CMPTxx, there is no difference in
function or performance between CMPTxx/S and CMPTxx/SU. It is
intended that software generate CMPTxx/SU in place of CMPTxx/S.

In the same manner, CVTQS and CVTQT can take an inexact result
trap, but not an underflow. Because there is no encoding for a
CVTQx/SI instruction, it is intended that software generate
CVTQx/SUI in place of CVTQx/SI.
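
The function codes in Table 19 follow a regular bit pattern for the arithmetic instructions: the rounding qualifier selects bits 7:6 of the 11-bit field, and the trap qualifier selects the upper three bits. The following Python sketch reproduces table entries from the unqualified ("None") column under that reading. The function name and the two lookup tables are illustrative and are not part of the data sheet, and the sketch covers only the /U, /SU, and /SUI trap families. It does not check whether a given combination is actually defined for an instruction.

```python
# Rounding qualifier -> value of bits 7:6 of the 11-bit function code,
# as read off the ADDS/ADDT rows of Table 19 (an inference from the table).
ROUNDING_BITS = {"": 0b10, "C": 0b00, "M": 0b01, "D": 0b11}

# Trap qualifier -> value of the upper function-code bits (same caveat).
TRAP_BITS = {"": 0x000, "U": 0x100, "SU": 0x500, "SUI": 0x700}


def ieee_function_code(base: int, trap: str = "", rounding: str = "") -> int:
    """Derive a qualified function code from the unqualified table entry.

    base is the value in the "None" column (for example 0x080 for ADDS).
    """
    low_bits = base & 0x03F  # source and function bits are left unchanged
    return low_bits | (ROUNDING_BITS[rounding] << 6) | TRAP_BITS[trap]


if __name__ == "__main__":
    # ADDS/SUIC is listed as 700 and CVTTS/SUD as 5EC in Table 19.
    print(f"{ieee_function_code(0x080, 'SUI', 'C'):03X}")
    print(f"{ieee_function_code(0x0AC, 'SU', 'D'):03X}")
```

Instructions with blank cells in the table, such as CMPTxx with rounding qualifiers, have no such encoding, so a caller would still need to consult the table before emitting a code.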
10.3 VAX Floating-Point Instructions

Table 20 lists the hexadecimal value of the 11-bit function code field for the VAX
floating-point instructions. The opcode for these instructions is 15₁₆.

Table 20 VAX Floating-Point Instruction Function Codes

Mnemonic  None  /C    /U    /UC   /S    /SC   /SU   /SUC
ADDF      080   000   180   100   480   400   580   500
ADDG      0A0   020   1A0   120   4A0   420   5A0   520
CMPGEQ    0A5   -     -     -     4A5   -     -     -
CMPGLE    0A7   -     -     -     4A7   -     -     -
CMPGLT    0A6   -     -     -     4A6   -     -     -
CVTDG     09E   01E   19E   11E   49E   41E   59E   51E
CVTGD     0AD   02D   1AD   12D   4AD   42D   5AD   52D
CVTGF     0AC   02C   1AC   12C   4AC   42C   5AC   52C
CVTQF     0BC   03C   -     -     -     -     -     -
CVTQG     0BE   03E   -     -     -     -     -     -
DIVF      083   003   183   103   483   403   583   503
DIVG      0A3   023   1A3   123   4A3   423   5A3   523
MULF      082   002   182   102   482   402   582   502
MULG      0A2   022   1A2   122   4A2   422   5A2   522
SUBF      081   001   181   101   481   401   581   501
SUBG      0A1   021   1A1   121   4A1   421   5A1   521

Mnemonic  None  /C    /V    /VC   /S    /SC   /SV   /SVC
CVTGQ     0AF   02F   1AF   12F   4AF   42F   5AF   52F
10.4 Opcode Summary

Table 21 lists all Alpha opcodes from 00 (CALL_PAL) through 3F (BGT). In the
table, the column headings that appear over the instructions have a granularity
of 8₁₆. The rows beneath the Offset column supply the individual hexadecimal
number to resolve that granularity.

If an instruction column has a 0 in the right (low) hexadecimal digit, replace that
0 with the number to the left of the slash (/) in the Offset column on the
instruction's row. If an instruction column has an 8 in the right (low)
hexadecimal digit, replace that 8 with the number to the right of the slash in the
Offset column.

For example, the third row (2/A) under the 10₁₆ column contains the symbol INTS*,
representing all the integer shift instructions. The opcode for those
instructions is 12₁₆ because the 0 in 10 is replaced by the 2 in the Offset
column. Likewise, the third row under the 18₁₆ column contains the symbol JSR*,
representing all jump instructions. The opcode for those instructions is 1A
because the 8 in the heading is replaced by the number to the right of the slash
in the Offset column.
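
The lookup rule above can be written out as a short Python sketch. The function name and the OFFSETS list are illustrative. The list mirrors the eight Offset entries of Table 21 (0/8 through 7/F), and rows are numbered from 0.

```python
# Offset column of Table 21: (digit for headings ending in 0,
#                             digit for headings ending in 8).
OFFSETS = [("0", "8"), ("1", "9"), ("2", "A"), ("3", "B"),
           ("4", "C"), ("5", "D"), ("6", "E"), ("7", "F")]


def resolve_opcode(heading: str, row: int) -> int:
    """Return the opcode for a table cell.

    heading is the column heading as printed ("00", "08", ..., "38");
    row is the zero-based row index in the Offset column.
    """
    left, right = OFFSETS[row]
    low_digit = heading[-1]
    if low_digit == "0":
        replacement = left
    elif low_digit == "8":
        replacement = right
    else:
        raise ValueError("column headings end in 0 or 8")
    return int(heading[:-1] + replacement, 16)


if __name__ == "__main__":
    # The two examples from the text: INTS* and JSR* on the third row.
    print(f"{resolve_opcode('10', 2):02X}")
    print(f"{resolve_opcode('18', 2):02X}")
```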
The instruction format is listed under the instruction symbol.

Table 21 Opcode Summary

Offset  00     08     10     18     20     28     30     38
0/8     PAL*   LDA    INTA*  MISC*  LDF    LDL    BR     BLBC
        (pal)  (mem)  (op)   (mem)  (mem)  (mem)  (br)   (br)
1/9     Res    LDAH   INTL*  \PAL\  LDG    LDQ    FBEQ   BEQ
               (mem)  (op)          (mem)  (mem)  (br)   (br)
2/A     Res    LDBU   INTS*  JSR*   LDS    LDL_L  FBLT   BLT
               (mem)  (op)   (mem)  (mem)  (mem)  (br)   (br)
3/B     Res    LDQ_U  INTM*  \PAL\  LDT    LDQ_L  FBLE   BLE
               (mem)  (op)          (mem)  (mem)  (br)   (br)
4/C     Res    LDWU   Res    SEXT*  STF    STL    BSR    BLBS
               (mem)         (op)   (mem)  (mem)  (br)   (br)
5/D     Res    STW    FLTV*  \PAL\  STG    STQ    FBNE   BNE
               (mem)  (op)          (mem)  (mem)  (br)   (br)
6/E     Res    STB    FLTI*  \PAL\  STS    STL_C  FBGE   BGE
               (mem)  (op)          (mem)  (mem)  (br)   (br)
7/F     Res    STQ_U  FLTL*  \PAL\  STT    STQ_C  FBGT   BGT
               (mem)  (op)          (mem)  (mem)  (br)   (br)

Symbol  Meaning
FLTI*   IEEE floating-point instruction opcodes
FLTL*   Floating-point operate instruction opcodes
FLTV*   VAX floating-point instruction opcodes
INTA*   Integer arithmetic instruction opcodes
INTL*   Integer logical instruction opcodes
INTM*   Integer multiply instruction opcodes
INTS*   Integer shift instruction opcodes
JSR*    Jump instruction opcodes
MISC*   Miscellaneous instruction opcodes
PAL*    PALcode instruction (CALL_PAL) opcodes
\PAL\   Reserved for PALcode
Res     Reserved for DIGITAL
SEXT*   Sign extend opcodes
10.5 Required PALcode Function Codes

The opcodes listed in Table 22 are required for all Alpha implementations. The
notation used is oo.ffff, where oo is the hexadecimal 6-bit opcode and ffff is the
hexadecimal 26-bit function code.

Table 22 Required PALcode Function Codes

Mnemonic  Type          Function Code
DRAINA    Privileged    00.0002
HALT      Privileged    00.0000
IMB       Unprivileged  00.0086
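
As an illustration of the oo.ffff notation, the following Python sketch packs a Table 22 entry into a 32-bit instruction word, with the 6-bit opcode in bits 31:26 and the 26-bit function code in bits 25:0 (the Alpha PALcode instruction format). The function name is hypothetical.

```python
def encode_call_pal(notation: str) -> int:
    """Convert an "oo.ffff" string into a 32-bit PALcode-format word."""
    opcode_text, function_text = notation.split(".")
    opcode = int(opcode_text, 16)
    function = int(function_text, 16)
    if not 0 <= opcode < (1 << 6):
        raise ValueError("opcode must fit in 6 bits")
    if not 0 <= function < (1 << 26):
        raise ValueError("function code must fit in 26 bits")
    # Opcode occupies bits 31:26, function code bits 25:0.
    return (opcode << 26) | function


if __name__ == "__main__":
    for mnemonic, code in (("DRAINA", "00.0002"),
                           ("HALT", "00.0000"),
                           ("IMB", "00.0086")):
        print(mnemonic, f"{encode_call_pal(code):08X}")
```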
11 Electrical Data

This section describes the electrical characteristics of the 21164 component and its 
interface pins. It is organized as follows: 

Electrical characteristics 

DC characteristics 

Clocking scheme 

AC characteristics 

Power supply considerations 

11.1 Electrical Characteristics 

Table 23 lists the maximum ratings for the 21164 and Table 24 lists the operating 
voltages. 

Table 23 21164 Absolute Maximum Ratings

Characteristics                          Ratings
Storage temperature                      -55°C to 125°C (-67°F to 257°F)
Junction temperature                     15°C to 90°C (59°F to 194°F)
Supply voltage                           Vss = -0.5 V, Vddi = 2.5 V, Vdd = 3.3 V
Signal input or output applied           -0.5 V to 4.6 V
Typical Vdd worst case power             3.0 W
  @ Vdd = 3.3 V, frequency = 366 MHz     For frequencies greater than 366 MHz,
                                         add 0.5 W for each 133 MHz.
Typical Vddi worst case power            27.5 W
  @ Vddi = 2.5 V, frequency = 366 MHz    For frequencies greater than 366 MHz,
                                         add 5.0 W for each 66 MHz.






Caution: Stress beyond the absolute maximum rating can cause permanent damage
to the 21164. Exposure to absolute maximum rating conditions for
extended periods of time can affect the 21164 reliability.
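
The frequency adders in Table 23 can be turned into a rough power estimate. The Python sketch below assumes the adders scale linearly with frequency above 366 MHz. The data sheet does not say whether the increase is linear or stepped, so this is a planning aid and not a specification. The function name is illustrative.

```python
def typical_worst_case_power(freq_mhz: float) -> tuple[float, float]:
    """Estimate (Vdd power, Vddi power) in watts at a given frequency.

    Assumes linear scaling of the Table 23 adders above 366 MHz.
    """
    if freq_mhz < 366.0:
        raise ValueError("Table 23 gives no figures below 366 MHz")
    extra_mhz = freq_mhz - 366.0
    vdd_watts = 3.0 + 0.5 * extra_mhz / 133.0    # 0.5 W per 133 MHz
    vddi_watts = 27.5 + 5.0 * extra_mhz / 66.0   # 5.0 W per 66 MHz
    return vdd_watts, vddi_watts


if __name__ == "__main__":
    for freq in (366, 432, 499):
        vdd, vddi = typical_worst_case_power(freq)
        print(f"{freq} MHz: Vdd {vdd:.2f} W, Vddi {vddi:.2f} W")
```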

Table 24 Operating Voltages

         Nominal   Maximum   Minimum
Vdd      3.3 V     3.46 V    3.13 V
Vddi     2.5 V     2.6 V     2.4 V

11.2 DC Characteristics 

The 21164 is designed to run in a 3.3-V CMOS/TTL environment. The 21164 is 
tested and characterized in a CMOS environment. 

11.2.1 Power Supply 

The Vss pins are connected to 0.0 V, the Vddi pins are connected to 2.5 V ±0.1 V, 
and the Vdd pins are connected to 3.3 V ±5%. 

11.2.2 Input Signal Pins 

Nearly all input signals are ordinary CMOS inputs with standard TTL levels (see
Table 25). The exception, osc_clk_in_h,l, is described in Section 11.3.1.

After power has been applied, input and bidirectional pins can be driven to a
maximum dc voltage of Vclamp at a maximum current of Iclamp without harming the
21164. Refer to Table 25 for Vclamp and Iclamp values. Inputs greater than
Vclamp will be clamped to Vclamp provided that the current does not exceed
Iclamp. The 21164 may be damaged if the voltage exceeds Vclamp or the current
exceeds Iclamp.

11.2.3 Output Signal Pins 

Output pins are ordinary 3.3-V CMOS outputs. Although output signals are
rail-to-rail, timing is specified to Vdd/2.

Note: The 21164 microprocessor chips do not have an onchip resistor for an
output driver. Earlier versions of the 21164 have a 30-Ω (typical) onchip
resistor for an output driver.

Bidirectional pins are either input or output pins, depending on control timing. When
functioning as output pins, they are ordinary 3.3-V CMOS outputs.

Table 25 shows the CMOS dc input and output pins. 



Table 25 CMOS DC Input/Output Characteristics

Symbol    Description                          Min.  Max.      Units  Test Conditions
Vih       High-level input voltage             2.0   -         V      -
Vil       Low-level input voltage              -     0.8       V      -
Voh       High-level output voltage            2.4   -         V      Ioh = -6.0 mA
Vol       Low-level output voltage             -     0.4       V      Iol = 6.0 mA
Iil_pd    Input with pull-down leakage         -     ±50       µA     Vin = 0 V
          current
Iih_pd    Input with pull-down current         -     200       µA     Vin = 2.4 V
Iil_pu    Input with pull-up current           -     -800      µA     Vin = 0.4 V
Iih_pu    Input with pull-up leakage           -     ±50       µA     Vin = Vdd
          current
Iozl_pd   Output with pull-down leakage        -     ±100      µA     Vin = 0 V
          current (tristate)
Iozh_pd   Output with pull-down current        -     300(1)    µA     Vin = 2.4 V
          (tristate)
Iozl_pu   Output with pull-up current          -     -800      µA     Vin = 0.4 V
          (tristate)
Iozh_pu   Output with pull-up leakage          -     ±100      µA     Vin = Vdd
          current (tristate)
Vclamp    Maximum clamping voltage             -     Vdd+1.0   V      Iclamp = 100 mA
Idd       Peak power supply current for        -     1.3       A      Vdd = 3.465 V
          Vdd power supply                                            Frequency = 366 MHz
                                                                      For frequencies greater than
                                                                      366 MHz, add 0.4 A for each
                                                                      133 MHz.
Iddi      Peak power supply current for        -     13.8      A      Vddi = 2.6 V
          Vddi power supply                                           Frequency = 366 MHz
                                                                      For frequencies greater than
                                                                      366 MHz, add 2.4 A for each
                                                                      66 MHz.

(1) For chip speeds greater than 500 MHz, the maximum Iozh_pd is 500 µA.

Note: The supply current figures assume a sysclk ratio of 3 and worst case
loading of output pins.

Most pins have low current pull-down devices to Vss. However, two pins have a
pull-up device to Vdd. The pull-downs (or pull-ups) are always enabled. This means
that some current will flow from the 21164 (if the pin has a pull-up device) or into
the 21164 (if the pin has a pull-down device) even when the pin is in the
high-impedance state. All pins have pull-down devices, except for the pins in the
following table:

Signal Name     Notes
tms_h           Has a pull-up device
tdi_h           Has a pull-up device
osc_clk_in_h    50 Ω to Vterm (≈ Vdd/2) (See Figure 13)
osc_clk_in_l    50 Ω to Vterm (≈ Vdd/2) (See Figure 13)
temp_sense      150 Ω to Vss
11.3 Clocking Scheme

Note: The preferred clock mode of the 21164 is 1x. This is a change from the
earlier versions of the 21164, which had a preferred clock mode of 2x.
Refer to Section 11.4.8 for more details.

The differential input clock signals osc_clk_in_h,l run at the internal frequency of 
the time base for the 21164. The output signal cpu_clk_out_h toggles with an 
unspecified propagation delay relative to the transitions on osc_clk_in_h,l. 

System designers have a choice of two system clocking schemes to run the 21164 
synchronous to the system: 

1. The 21164 generates and drives out a system clock, sys_clk_out1_h,l. It runs
synchronous to the internal clock at a selected ratio of the internal clock
frequency. There is a small clock skew between the internal clock and
sys_clk_out1_h,l.

2. The 21164 synchronizes to a system clock, ref_clk_in_h, supplied by the
system. The ref_clk_in_h clock runs at a selected ratio of the 21164 internal clock
frequency. The internal clock is synchronized to the reference clock by an onchip
digital phase-locked loop (DPLL).

11.3.1 Input Clocks 

The differential input clocks osc_clk_in_h,l provide the time base for the chip when 
dc_ok_h is asserted. These pins are self-biasing, and must be capacitively coupled to 
the clock source on the module. 

Note: It is not desirable to drive the osc_clk_in_h,l pins directly. This is a
change from earlier versions of the 21164.

The terminations on these signals are designed to be compatible with system
oscillators of arbitrary dc bias. The oscillator must have a duty cycle of 60%/40%
or tighter. Figure 13 shows the input network and the schematic equivalent of
osc_clk_in_h,l terminations.

Figure 13 osc_clk_in_h,l Input Network and Terminations

Note: The coupling capacitors are 47 pF to 220 pF.

Ring Oscillator 

When signal dc_ok_h is deasserted, the clock outputs follow the internal ring
oscillator. The 21164 runs off the ring oscillator, just as it would when an
external clock is applied. The frequency of the ring oscillator varies from chip
to chip within a range of 10 MHz to 100 MHz. This corresponds to an internal CPU
clock frequency range of 5 MHz to 50 MHz. The system clock divisor is forced to 8,
and the sys_clk_out2 delay is forced to 3.

Clock Sniffer 

A special onchip circuit monitors the osc_clk_in pins and detects when input clocks 
are not present. When activated, this circuit switches the 21164 clock generator from 
the osc_clk_in pins to the internal ring oscillator. This happens independently of the 
state of the dc_ok_h pin. The dc_ok_h pin functions normally if clocks are present 
on the osc_clk_in pins. 
11.3.2 Clock Termination and Impedance Levels

In Figure 13, the clock is designed to approximate a 50-Ω termination for the
purpose of impedance matching for those systems that drive input clocks across long
traces. The clock input pins appear as a 50-Ω series termination resistor connected
to a high impedance voltage source. The voltage source produces a nominal voltage
value of Vdd/2. The source has an impedance of between 130 Ω and 600 Ω. This
voltage is called the self-bias voltage and sources current when the applied voltage at
the clock input pins is less than the self-bias voltage. It sinks current when the
applied voltage exceeds the self-bias voltage. This high impedance bias driver allows
a clock source of arbitrary dc bias to be ac coupled to the 21164. The peak-to-peak
amplitude of the clock source must be between 0.6 V and 3.0 V. Either a
square-wave or a sinusoidal source may be used. Full-rail clocks may be driven by
testers. In any case, the oscillator should be ac coupled to the osc_clk_in_h,l
inputs by 47-pF through 220-pF capacitors.

Figure 14 shows a plot of the simulated impedance versus the clock input frequency. 
Figure 13 is a simplified circuit of the complex model used to create Figure 14. 



Figure 14 Impedance vs Clock Input Frequency 
11.3.3 AC Coupling

Using series coupling (blocking) capacitors renders the 21164 clock input pins
insensitive to the oscillator's dc level. When connected this way, oscillators with
any dc offset relative to Vss can be used provided they can drive a signal into the
osc_clk_in_h,l pins with a peak-to-peak level of at least 600 mV, but no greater than
3.0 V peak-to-peak.

The value of the coupling capacitor is not overly critical. However, it should be
sufficiently low impedance at the clock frequency so that the oscillator's output
signal (when measured at the osc_clk_in_h,l pins) is not attenuated below the 600-mV,
peak-to-peak lower limit. For sine waves or oscillators producing nearly sinusoidal
(pseudo square wave) outputs, 220 pF is recommended at 433 MHz. A high-quality
dielectric such as NPO is required to avoid dielectric losses.
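
To see why a 220-pF capacitor is low impedance at 433 MHz, the reactance can be computed directly. The Python sketch below also estimates the amplitude at the pin under a deliberately simplified model: the clock input is treated as a purely resistive 50-Ω load and the oscillator's source impedance is ignored. Both are assumptions, and Figure 14 indicates the real impedance varies with frequency. The function names are illustrative.

```python
import math


def capacitor_reactance_ohms(freq_hz: float, cap_farads: float) -> float:
    """Magnitude of the capacitive reactance, 1 / (2 * pi * f * C)."""
    return 1.0 / (2.0 * math.pi * freq_hz * cap_farads)


def pin_amplitude_vpp(source_vpp: float, freq_hz: float, cap_farads: float,
                      load_ohms: float = 50.0) -> float:
    """Peak-to-peak level after the series capacitor (simplified RC divider)."""
    reactance = capacitor_reactance_ohms(freq_hz, cap_farads)
    return source_vpp * load_ohms / math.hypot(load_ohms, reactance)


def amplitude_in_range(vpp: float) -> bool:
    """Check the 0.6 V to 3.0 V peak-to-peak window from the text."""
    return 0.6 <= vpp <= 3.0


if __name__ == "__main__":
    xc = capacitor_reactance_ohms(433e6, 220e-12)
    vpp = pin_amplitude_vpp(1.0, 433e6, 220e-12)
    print(f"reactance {xc:.2f} ohms, pin level {vpp:.4f} V, "
          f"in range: {amplitude_in_range(vpp)}")
```

Under these assumptions the capacitor contributes under 2 Ω at 433 MHz, so almost all of the source amplitude reaches the pin.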

Table 26 shows the input clock specification. 

Table 26 Input Clock Specification

Signal          Parameter        Nominal Bin(1)  Unit
osc_clk_in_h,l  Symmetry         50 ± 10         %
osc_clk_in_h,l  Minimum voltage  0.6             V (peak-to-peak)
osc_clk_in_h,l  Z input          50              Ω

(1) Minimum clock frequency = 300 MHz for devices ≤ 433 MHz
    Minimum clock frequency = 440 MHz for devices ≥ 466 MHz
    Maximum clock frequency = 600 MHz = 1/Tcycle

11.4 AC Characteristics 

This section describes the ac timing specifications for the 21164. 

11.4.1 Test Configuration 

All input timing is specified relative to the crossing of standard TTL input levels of 
0.8 V and 2.0 V. Output timing is to the nominal CMOS switch point of Vdd/2 (see 
Figure 15). 

Because the speed and complexity of microprocessors have increased substantially
over the years, it is necessary to change the way they are tested. The traditional
assumption that all loads can be lumped into a single capacitance no longer
holds. Rather, the model of a transmission line with discrete loads is a
much more realistic approach for current test technology.
Figure 15 Input/Output Pin Timing

(Figure: input timing and output timing of the input and output signals relative
to the internal CPU clock over one Tcycle. Output signals are measured at the
50% point, Vdd/2.)



Typically, printed circuit board (PCB) etch has a characteristic impedance of
approximately 75 Ω. This may vary from 60 Ω to 90 Ω with tolerances. If the line is
driven in the electrical center, the load could be as low as 30 Ω. Therefore, a
characteristic impedance range of 30 Ω to 90 Ω could be experienced.

The 21164 output drivers are designed with typical printed circuit board applications
in mind rather than trying to accommodate a 40-pF test load specification. As such,
each driver "launches" a voltage step into a characteristic impedance ranging from
30 Ω to 90 Ω.

There is no source termination resistor in the 21164 fabricated in 0.35-µm CMOS
process technology. The source impedance of the driver is approximately 32 Ω ±17.
The circuit is designed to deliver a TTL signal under worst case conditions. Under
light load, high drive voltages, and fast process conditions there may be considerable
overdrive. It may be necessary to install termination or clamping elements to the
signal etches or loads.

11.4.2 Pin Timing

The following sections describe Bcache loop timing, sys_clk-based system timing, 
and reference clock-based system timing. 

11.4.2.1 Backup Cache Loop Timing 

The 21164 can be configured to support an optional offchip backup cache (Bcache). 
Private Bcache read or write (Scache victims) transactions initiated by the 21164 are 
independent of the system clocking scheme. Bcache loop timing must be an integer 
multiple of the 21164 cycle time. 

Table 27 lists the Bcache loop timing. 



Table 27 Bcache Loop Timing

                                          Value
Signal             Specification     366 MHz to 500 MHz          Faster than 500 MHz         Name
data_h<127:0>      Input setup       1.2 ns                      1.1 ns                      Tdsu
data_h<127:0>      Input hold        0.0 ns                      -0.1 ns                     Tdh
data_h<127:0>      Output delay      Tdd + Tcycle + 0.4 ns(1)    Tdd + Tcycle + 0.2 ns(2)    Tdod
data_h<127:0>      Output hold       Tmdd + Tcycle               Tmdd + Tcycle               Tdoh
index_h<25:4>,     Output delay      Tbedd + 0.4 ns, or          Tbedd + 0.2 ns, or          Tiod
st_clk1_h,                           Tbddd + 0.4 ns(1,4)         Tbddd + 0.2 ns(2,4)
st_clk2_h(3)
index_h<25:4>,     Output hold time  Tmdd                        Tmdd                        Tioh
st_clk1_h,
st_clk2_h(3)

(1) The value 0.4 ns accounts for onchip driver and clock skew.
(2) The value 0.2 ns accounts for onchip driver and clock skew.
(3) See 21164 change document for the positioning of st_clk1_h and st_clk2_h with
    respect to the Bcache index pins.
(4) For big drive enabled or big drive disabled, respectively. See Table 29.

Outgoing Bcache index and data signals are driven off the internal clock edge and 
the incoming Bcache tag and data signals are latched on the same internal clock 
edge. Table 28 and Table 29 show the output driver characteristics for the normal 
driver and big driver respectively. 
Additional drive for the following pins can be enabled by connecting big_drv_en_h
to Vdd:

• index_h<25:4>

• tag_ram_oe_h, tag_ram_we_h

• data_ram_oe_h, data_ram_we_h

• st_clk1_h, st_clk2_h

If any of the previous pins are connected to lightly loaded lines (less than 40 pF),
additional drive should not be enabled or the lines should be properly terminated to
avoid transmission line ringing.

Table 28 Normal Output Driver Characteristics

Specification          40-pF Load  10-pF Load          Name
Maximum driver delay   2.7 ns      1.6 ns              Tdd
Minimum driver delay   1.0 ns      1.0 ns (0.6 ns(1))  Tmdd

(1) For chip speeds greater than 500 MHz, the minimum delay is 0.6 ns.

Table 29 Big Output Driver Characteristics

Specification          60-pF Load  40-pF Load  10-pF Load          Name
Extra Drive Disabled
Maximum driver delay   NA          2.8 ns      1.7 ns              Tbddd
Minimum driver delay   NA          1.0 ns      1.0 ns (0.6 ns(1))  Tmdd
Extra Drive Enabled
Maximum driver delay   2.7 ns      2.2 ns      1.7 ns              Tbedd
Minimum driver delay   1.0 ns      1.0 ns      1.0 ns (0.6 ns(1))  Tmdd

NA = Not applicable.

(1) For chip speeds greater than 500 MHz, the minimum delay is 0.6 ns.

Output pin timing is specified for lumped 40-pF and 10-pF loads for the normal
driver and lumped 60-pF, 40-pF, and 10-pF loads for the big driver. In some cases,
the circuit may have loads higher than 40 pF (60 pF for big driver). The 21164 can
safely drive higher loads provided the average charging or discharging current from
each pin is 11 mA or less for normal output drivers or 25 mA or less for big output
drivers. The following equations can be used to determine the maximum capacitance
that can be safely driven by each pin:

• For normal output drivers: Cmax (in pF) = 5t, where t is the waveform period
(measured from rising to rising or falling to falling edge), in nanoseconds.

• For big output drivers: C^ax (i^ P^) - ^^' where t is the waveform period (mea- 
sured from rising to rising or falling to falling edge), in nanoseconds. 

For example, if the waveform appearing on a given normal I/O pin has a 15.0-ns 
period, it can safely drive up to and including 75 pF. 
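
The worked example (a 15.0-ns period gives 75 pF) implies a coefficient of 5 pF per nanosecond for normal output drivers. The Python sketch below applies that inferred coefficient. It does not cover big output drivers, whose coefficient is not recoverable from the example. The function name is illustrative.

```python
# Coefficient inferred from the worked example: 75 pF / 15.0 ns.
NORMAL_DRIVER_PF_PER_NS = 5.0


def max_capacitance_pf_normal(period_ns: float) -> float:
    """Maximum load a normal output driver can safely drive, in pF.

    period_ns is the waveform period, rising edge to rising edge.
    """
    if period_ns <= 0:
        raise ValueError("period must be positive")
    return NORMAL_DRIVER_PF_PER_NS * period_ns


if __name__ == "__main__":
    print(max_capacitance_pf_normal(15.0))
```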

Figure 16 shows the Bcache read and write timing. 
Figure 16 Bcache Timing 
11.4.2.2 sys_clk-Based Systems

All timing is specified relative to the rising edge of the internal CPU clock.

Table 30 shows 21164 system clock sys_clk_out1_h,l output timing. Setup and hold
times are specified independent of the relative capacitive loading of
sys_clk_out1_h,l, addr_h<39:4>, data_h<127:0>, and cmd_h<3:0> signals. The
ref_clk_in_h signal must be tied to Vdd for proper operation.
Table 30 21164 System Clock Output Timing (sysclk=TØ)

                                          Value
Signal             Specification     366 MHz to 500 MHz                 Faster than 500 MHz                Name
data_bus_req_h,    Input setup       1.2 ns                             1.1 ns                             Tdsu
data_h<127:0>,
addr_h<39:4>
data_bus_req_h,    Input hold        0.5 × Tcycle                       0.5 × Tcycle                       Troh
data_h<127:0>,
addr_h<39:4>
addr_h<39:4>       Output delay      Tdd + 0.5 × Tcycle + 0.9 ns(1)     Tdd + 0.5 × Tcycle + 0.7 ns(2)     Traod
addr_h<39:4>       Output hold time  Tmdd                               Tmdd(3)                            Traoh
data_h<127:0>      Output delay      Tdd + 1.5 × Tcycle + 0.9 ns(1)     Tdd + 1.5 × Tcycle + 0.7 ns(2)     Trdod(4)
data_h<127:0>      Output hold time  Tmdd + Tcycle                      Tmdd(3) + Tcycle                   Trdoh(4)

Non-Pipe_Latch Mode
addr_bus_req_h     Input setup       3.4 ns                             3.4 ns                             Tntrabrsu
addr_bus_req_h     Input hold        0.5 × Tcycle                       0.5 × Tcycle                       Tntrabrh
dack_h             Input setup       3.2 ns                             3.2 ns                             Tntracksu
cack_h             Input setup       3.4 ns                             3.4 ns                             Tntrcacksu
cack_h, dack_h     Input hold        0.5 × Tcycle                       0.5 × Tcycle                       Tntrackh

Pipe_Latch Mode(5)
addr_bus_req_h,    Input setup       1.2 ns                             1.1 ns                             Ttracksu
cack_h, dack_h
addr_bus_req_h,    Input hold        0.5 × Tcycle                       0.5 × Tcycle                       Ttrackh
cack_h, dack_h

(1) The value 0.9 ns accounts for onchip skews that include 0.4 ns for driver and
    clock skew, phase detector skews due to circuit delay (0.2 ns), and delay in
    ref_clk_in_h due to the package (0.3 ns).
(2) The value 0.7 ns accounts for onchip skews that include 0.2 ns for driver and
    clock skew, phase detector skews due to circuit delay (0.2 ns), and delay in
    ref_clk_in_h due to the package (0.3 ns).
(3) For chip speeds greater than 500 MHz, Tmdd is 0.6 ns.
(4) For all write transactions initiated by the 21164, data is driven one CPU
    cycle later.
(5) In pipe_latch mode, control signals are piped onchip for one sys_clk_out1_h,l
    before usage.
Figure 17 shows sys_clk system timing.

Figure 17 sys_clk System Timing

11.4.2.3 Reference Clock-Based Systems

Systems that generate their own system clock expect the 21164 to synchronize its
sys_clk_out1_h,l outputs to their system clock. The 21164 uses a digital
phase-locked loop (DPLL) to synchronize its sys_clk_out1 signals to the system
clock that is applied to the ref_clk_in_h signal.

Table 31 shows all timing relative to the rising edge of ref_clk_in_h.
Table 31 21164 Reference Clock Input Timing

                                          Value
Signal             Specification     366 MHz to 500 MHz                 Faster than 500 MHz                Name
data_bus_req_h,    Input setup       1.2 ns                             1.1 ns                             Tdsu
data_h<127:0>,
addr_h<39:4>
data_bus_req_h,    Input hold        0.5 × Tcycle                       0.5 × Tcycle                       Troh
data_h<127:0>,
addr_h<39:4>
addr_h<39:4>       Output delay      Tdd + 0.5 × Tcycle + 0.9 ns(1)     Tdd + 0.5 × Tcycle + 0.7 ns(2)     Traod
addr_h<39:4>       Output hold time  Tmdd                               Tmdd(3)                            Traoh
data_h<127:0>      Output delay      Tdd + 1.5 × Tcycle + 0.9 ns(1)     Tdd + 1.5 × Tcycle + 0.7 ns(2)     Trdod(4)
data_h<127:0>      Output hold time  Tmdd + Tcycle                      Tmdd(3) + Tcycle                   Trdoh(4)

Non-Pipe_Latch Mode
addr_bus_req_h     Input setup       3.4 ns                             3.4 ns                             Tntrabrsu
addr_bus_req_h     Input hold        0.5 × Tcycle                       0.5 × Tcycle                       Tntrabrh
dack_h             Input setup       3.2 ns                             3.2 ns                             Tntracksu
cack_h             Input setup       3.4 ns                             3.4 ns                             Tntrcacksu
cack_h, dack_h     Input hold        0.5 × Tcycle                       0.5 × Tcycle                       Tntrackh

Pipe_Latch Mode(5)
addr_bus_req_h,    Input setup       1.2 ns                             1.1 ns                             Ttracksu
cack_h, dack_h
addr_bus_req_h,    Input hold        0.5 × Tcycle                       0.5 × Tcycle                       Ttrackh
cack_h, dack_h

(1) The value 0.9 ns accounts for onchip skews that include 0.4 ns for driver and
    clock skew, phase detector skews due to circuit delay (0.2 ns), and delay in
    ref_clk_in_h due to the package (0.3 ns).
(2) The value 0.7 ns accounts for onchip skews that include 0.2 ns for driver and
    clock skew, phase detector skews due to circuit delay (0.2 ns), and delay in
    ref_clk_in_h due to the package (0.3 ns).
(3) For chip speeds greater than 500 MHz, Tmdd is 0.6 ns.
(4) For all write transactions initiated by the 21164, data is driven one CPU
    cycle later.
(5) In pipe_latch mode, control signals are piped onchip for one sys_clk_out1_h,l
    before usage.
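
The symbolic output delays in Table 31 become concrete once Tcycle and Tdd are known. The Python sketch below evaluates Traod and Trdod, taking Tcycle as 1000/f ns for a CPU frequency f in MHz and taking the data delay as Tdd + 1.5 × Tcycle plus the skew term, as it appears in Table 34. Tdd is passed in by the caller, for example the 2.7-ns maximum for a 40-pF load from Table 28. The function names are illustrative.

```python
def tcycle_ns(freq_mhz: float) -> float:
    """CPU cycle time in nanoseconds for a frequency in MHz."""
    return 1000.0 / freq_mhz


def ref_clk_output_delays(freq_mhz: float, tdd_ns: float) -> tuple[float, float]:
    """Return (Traod, Trdod) in nanoseconds for ref_clk_in-based systems."""
    # 0.9 ns skew allowance up to 500 MHz, 0.7 ns for faster parts.
    skew_ns = 0.9 if freq_mhz <= 500.0 else 0.7
    cycle = tcycle_ns(freq_mhz)
    traod = tdd_ns + 0.5 * cycle + skew_ns   # addr_h<39:4> output delay
    trdod = tdd_ns + 1.5 * cycle + skew_ns   # data_h<127:0> output delay
    return traod, trdod


if __name__ == "__main__":
    traod, trdod = ref_clk_output_delays(500.0, 2.7)
    print(f"Traod = {traod:.2f} ns, Trdod = {trdod:.2f} ns")
```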
11.4.3 Digital Phase-Locked Loop

Figure 18 and Table 32 describe the digital phase-locked loop (DPLL) stages of 
operation. 

Figure 18 ref_clk System Timing

(Figure: two panels, "Relationship of CPU Clock and ref_clk_in" and "Relationship
of CPU Clock, ref_clk_in and sys_clk_out1", showing the CPU Clock, ref_clk_in, and
sys_clk_out1 waveforms and the interval Tsysd.)

Table 32 describes the callouts shown in Figure 18.

Table 32 ref_clk System Timing Stages

Stage  Description
1      The internal CPU clock rising edge coincides with the rising edge of
       ref_clk_in_h.
2      The DPLL causes the internal CPU clock to stretch for one phase (1 cycle of
       osc_clk_in_h,l).
3      The stretch causes ref_clk_in_h to lead the internal CPU clock by one phase.
4      The CPU clock is always slightly faster than the external ref_clk_in_h and
       gains on ref_clk_in_h over time. Eventually the gain equals one phase and a
       new stretch phase follows.
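
Stage 4 implies that the interval between stretches depends on how much faster the CPU clock runs than ref_clk_in_h. The Python sketch below estimates that interval under a simple model: each ref_clk_in_h period, the CPU clock gains the difference between the reference period and the selected ratio times the CPU cycle time, and a stretch occurs once the accumulated gain equals one phase. The model, the function name, and all parameter names are assumptions made for illustration.

```python
def ref_cycles_between_stretches(t_ref_ns: float, ratio: int,
                                 t_cpu_ns: float, t_phase_ns: float) -> float:
    """Approximate number of ref_clk_in periods between DPLL stretches.

    t_ref_ns:   period of ref_clk_in_h
    ratio:      selected ratio of CPU clock to reference clock
    t_cpu_ns:   unstretched CPU cycle time
    t_phase_ns: duration of one phase (one osc_clk_in_h,l cycle)
    """
    gain_per_ref_cycle = t_ref_ns - ratio * t_cpu_ns
    if gain_per_ref_cycle <= 0:
        raise ValueError("CPU clock must be slightly faster than ref_clk_in")
    return t_phase_ns / gain_per_ref_cycle


if __name__ == "__main__":
    # Hypothetical numbers: 2.0-ns CPU cycle, ratio 4, reference 4 ps slow.
    print(ref_cycles_between_stretches(8.004, 4, 2.0, 1.0))
```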
Although systems that supply a ref_clk_in_h do not use sys_clk_out1_h,l, a
relationship between the two signals exists, just as in the sys_clk-based systems,
because the 21164 uses sys_clk_out1_h,l internally to determine timing during
system transactions.

11.4.4 Timing — Additional Signals 

This section lists timing for all other signals. 
11.4.4.1 Asynchronous Input Signals 

The following is a list of the asynchronous input signals:

clk_mode_h<2:0>       dc_ok_h            ref_clk_in_h          sys_reset_l†
oe_we_active_low_h    perf_mon_h*        big_drv_en_h          irq_h<3:0>*
mch_hlt_irq_h*        pwr_fail_irq_h*    sys_mch_chk_irq_h*

* These signals can also be used synchronously.
† Signal sys_reset_l may be deasserted synchronously.

11.4.4.2 Miscellaneous Signals

Table 33 and Table 34 list the timing for miscellaneous input-only and output-only 
signals. All timing is expressed in nanoseconds. 

Table 33 Input Timing for sys_clk_out- or ref_clk_in-Based Systems

Input setup
  Signals: cfail_h, fill_h, fill_error_h, fill_id_h, fill_nocheck_h, idle_bc_h,
           shared_h, system_lock_flag_h
           irq_h<3:0>, mch_hlt_irq_h, pwr_fail_irq_h, sys_mch_chk_irq_h
           Testability pins: port_mode_h, srom_data_h, srom_present_l
                 sys_clk_out          ref_clk_in
  Value          1.2 ns (1.1 ns(1))   1.2 ns (1.1 ns(1))
  Name           Tdsu                 Tdsu

Input hold
  Signals: cfail_h, fill_h, fill_error_h, fill_id_h, fill_nocheck_h, idle_bc_h,
           shared_h, system_lock_flag_h
           irq_h<3:0>, mch_hlt_irq_h, pwr_fail_irq_h, sys_mch_chk_irq_h
           sys_reset_l
           Testability pins: port_mode_h, srom_data_h, srom_present_l
                 sys_clk_out          ref_clk_in
  Value          0 ns (-0.1 ns(1))    0.5 × Tcycle
  Name           Tdh                  Troh

(1) For chip speeds greater than 500 MHz.
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Table 34 Output Timing for sys_clk_out- or ref_clk_in-Based Systems 



                                            Value                                                           Name
Signal                        Specification sys_clk_out                  ref_clk_in                         sys_clk_out  ref_clk_in

Unidirectional Signals

addr_res_h, int4_valid_h [1], Output delay  Tdd + 0.4 ns                 Tdd + 0.5 x Tcycle + 0.9 ns        Taod         Traod
scache_set_h, srom_clk_h,                   (Tdd + 0.2 ns [2])           (Tdd + 0.5 x Tcycle + 0.7 ns [2])
srom_oe_l, victim_pending_h

addr_res_h, int4_valid_h [1], Output hold   Tmdd                         Tmdd [3]                           Taoh         Traoh
scache_set_h, srom_clk_h,
srom_oe_l, victim_pending_h

int4_valid_h [4]              Output delay  Tdd + Tcycle + 0.4 ns        Tdd + 1.5 x Tcycle + 0.9 ns        Tdod         Trdod
                                            (Tdd + Tcycle + 0.2 ns [2])  (Tdd + 1.5 x Tcycle + 0.7 ns [2])

int4_valid_h [4]              Output hold   Tmdd + Tcycle                Tmdd [3] + Tcycle                  Tdoh         Trdoh

Bidirectional Signals

Input mode:

addr_cmd_par_h, cmd_h,        Input setup   1.2 ns                       1.2 ns                             Tdsu         Tdsu
data_check_h [5],                           (1.1 ns [2])                 (1.1 ns [2])
tag_ctl_par_h [5],
tag_dirty_h, tag_shared_h [5]

addr_cmd_par_h, cmd_h,        Input hold    0 ns                         0.5 x Tcycle                       Tdh          Tsdadh
data_check_h [5],                           (-0.1 ns [2])
tag_ctl_par_h [5],
tag_dirty_h, tag_shared_h [5]

Output mode:

addr_cmd_par_h, cmd_h,        Output delay  Tdd + 0.4 ns                 Tdd + 0.5 x Tcycle + 0.9 ns        Taod         Traod
tag_ctl_par_h [6],                          (Tdd + 0.2 ns [2])           (Tdd + 0.5 x Tcycle + 0.7 ns [2])
tag_dirty_h [6],
tag_shared_h [6],
tag_valid_h [6]

data_check_h [4]              Output delay  Tdd + Tcycle + 0.4 ns        Tdd + 1.5 x Tcycle + 0.9 ns        Tdod         Trdod
                                            (Tdd + Tcycle + 0.2 ns [2])  (Tdd + 1.5 x Tcycle + 0.7 ns [2])

addr_cmd_par_h, cmd_h,        Output hold   Tmdd                         Tmdd [3]                           Taoh         Traoh
tag_ctl_par_h [6],
tag_dirty_h [6],
tag_shared_h [6],
tag_valid_h [6]

data_check_h [4]              Output hold   Tmdd + Tcycle                Tmdd [3] + Tcycle                  Tdoh         Trdoh

[1] Read transaction.
[2] For chip speeds greater than 500 MHz.
[3] For chip speeds greater than 500 MHz, Tmdd is 0.6 ns.
[4] Write transaction.
[5] Fills from memory.
[6] Only for write broadcasts and system transactions.
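
The output delay entries in Table 34 are expressions rather than fixed numbers. The following sketch evaluates them for a given chip speed. The function name and arguments are hypothetical. Tdd is specified elsewhere in this data sheet and is passed in as a parameter, and Tcycle is assumed here to be the CPU cycle time.

```python
def output_delay_ns(tdd_ns, cpu_mhz, clocking, data_timing=False):
    """Evaluate a Table 34 output delay expression, in nanoseconds.

    tdd_ns      - Tdd, taken from elsewhere in the data sheet (placeholder here)
    cpu_mhz     - chip speed in MHz
    clocking    - "sys_clk_out" or "ref_clk_in"
    data_timing - True for the rows named Tdod/Trdod, False for Taod/Traod
    """
    tcycle_ns = 1000.0 / cpu_mhz       # assumed meaning of Tcycle: CPU cycle time
    fast = cpu_mhz > 500               # footnote: chip speeds greater than 500 MHz
    if clocking == "sys_clk_out":
        cycles = 1.0 if data_timing else 0.0
        adder_ns = 0.2 if fast else 0.4
    elif clocking == "ref_clk_in":
        cycles = 1.5 if data_timing else 0.5
        adder_ns = 0.7 if fast else 0.9
    else:
        raise ValueError("clocking must be 'sys_clk_out' or 'ref_clk_in'")
    return tdd_ns + cycles * tcycle_ns + adder_ns
```

For example, with a placeholder Tdd of 1.0 ns at 500 MHz, the ref_clk_in address-type output delay evaluates to 1.0 + 0.5 x 2.0 + 0.9 = 2.9 ns.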
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Signals in Table 35 are used to control Bcache data transfers. These signals are 
driven off the CPU clock. The choice of sys_clk_out or ref_clk_in has no impact on 
the timing of these signals. 



Table 35 Bcache Control Signal Timing

                                            Value
Signal                        Specification 366 MHz to 500 MHz      Faster than 500 MHz     Name

Input mode:

tag_data_h, tag_data_par_h,   Input setup   1.2 ns                  1.1 ns                  Tdsu
tag_valid_h

tag_data_h, tag_data_par_h,   Input hold    0 ns                    -0.1 ns                 Tdh
tag_valid_h

Output mode:

data_ram_oe_h,                Output delay  Tbedd + 0.4 ns or       Tbedd + 0.2 ns or       Taod
data_ram_we_h [1],                          Tbddd + 0.4 ns [2][3]   Tbddd + 0.2 ns [3][4]
tag_ram_oe_h,
tag_ram_we_h [1]

tag_data_h, tag_data_par_h,   Output delay  Tdd + 0.4 ns [2]        Tdd + 0.2 ns [4]        Taod
tag_valid_h

data_ram_oe_h,                Output hold   Tmdd                    Tmdd [5]                Taoh
data_ram_we_h [1],
tag_ram_oe_h,
tag_ram_we_h [1]

tag_data_h, tag_data_par_h,   Output hold   Tmdd                    Tmdd [5]                Taoh
tag_valid_h

[1] Pulse width for this signal is controlled through the BC_CONFIG IPR.
[2] The value 0.4 ns accounts for onchip driver and clock skew.
[3] For big drive enabled or big drive disabled, respectively. See Table 29.
[4] The value 0.2 ns accounts for onchip driver and clock skew.
[5] For chip speeds greater than 500 MHz, Tmdd is 0.6 ns.

11.4.5 Timing of Test Features 

Timing of 21164 testability features depends on the system clock rate and the test
port's operating mode. This section provides timing information that may be needed
for most common operations.

11.4.6 Icache BiSt Operation Timing

The Icache BiSt is invoked by deasserting the external reset signal sys_reset_l. 
Figure 19 shows the timing between various events relevant to BiSt operations. 



Figure 19 BiSt Timing Event - Time Line

[Time line. Events shown: deassert sys_reset_l; BiSt start
(test_status_h<1:0> = 01); deassert internal reset (T%Z_RESET_B_L), marked
with an asterisk; BiSt done (test_status_h<1:0> = 00).]

The timing for deassertion of internal reset (time t2, see asterisk) is valid only if an
SROM is not present (indicated by keeping signal srom_present_l deasserted). If an
SROM is present, the SROM load is performed once the BiSt completes. The internal
reset signal T%Z_RESET_B_L is extended until the end of the SROM load
(Section 11.4.7). In this case, the end of the time line shown in Figure 19 connects
to the beginning of the time line shown in Figure 20.

Table 36 and Table 37 list timing shown in Figure 19 for some of the system clock
ratios. Time t1 is measured starting from the rising edge of sysclk following the
deassertion of the sys_reset_l signal.

Table 36 BiSt Timing for Some System Clock Ratios, Port Mode=Normal
(System Cycles)

Sysclk    System Cycles
Ratio     t1     t2              t3

3         8      22644 + 2½      22645
4         7      19721 + 2½      19722
15        7      13291 + 14½     13292

Table 37 BiSt Timing for Some System Clock Ratios, Port Mode=Normal (CPU
Cycles)

Sysclk    CPU Cycles
Ratio     t1     t2              t3

3         24     67934½          67935
4         28     78886½          78888
15        105    199379½         199380
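
The system-cycle and CPU-cycle figures for BiSt timing are related by the sysclk ratio. The sketch below converts an entry of the form "n + m" into CPU cycles, assuming that n counts sysclk cycles and m counts additional CPU cycles, which is the convention stated in the Table 38 footnote. The helper name is hypothetical.

```python
from fractions import Fraction


def to_cpu_cycles(sysclk_cycles, extra_cpu_cycles, sysclk_ratio):
    """Convert 'sysclk_cycles + extra_cpu_cycles' into CPU cycles.

    Fractions keep the half-cycle entries exact.
    """
    return sysclk_cycles * sysclk_ratio + Fraction(extra_cpu_cycles)


# BiSt t2 for sysclk ratio 3: 22644 system cycles plus 2 1/2 CPU cycles
t2_ratio_3 = to_cpu_cycles(22644, Fraction(5, 2), 3)   # 67934 1/2 CPU cycles
```

Applying the same conversion to the ratio 4 and ratio 15 entries gives 78886½ and 199379½ CPU cycles, which agrees with the CPU-cycle table.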



11.4.7 Automatic SROM Load Timing 

The SROM load is triggered by the conclusion of BiSt if srom_present_l is asserted. 
The SROM load occurs at the internal cycle time of approximately 126 CPU cycles 
for srom_clk_h, but the behavior at the pins may shift slightly. 

Timing events are shown in Figure 20 and are listed in Table 38 and Table 39. 

Figure 20 SROM Load Timing Event - Time Line

[Time line. Events shown: BiSt done (test_status_h<1:0> = 00); assert
srom_oe_l; first rise of srom_clk_h; last rise of srom_clk_h; deassert
internal reset (T%Z_RESET_B_L); deassert srom_oe_l.]



Table 38 SROM Load Timing for Some System Clock Ratios (System Cycles)

Sysclk    System Cycles [1]
Ratio     t1    t2     t3          t4               t5

3         4     22     4408090     4408216 + ½      4408217
4         3     48     3306099     3306193 + 2½     3306194
15        3     13     881627      881651 + 9½      881652

[1] Measured in sysclk cycles, where "+ n" refers to an additional n CPU cycles.

Table 39 SROM Load Timing for Some System Clock Ratios (CPU Cycles)

Sysclk    CPU Cycles
Ratio     t1    t2     t3          t4               t5

3         12    66     13224270    13224648½        13224651
4         12    192    13224396    13224774½        13224776
15        45    195    13224405    13224774½        13224780



Figure 21 is a timing diagram of an SROM load sequence. 



Figure 21 Serial ROM Load Timing

[Timing diagram. Signals shown: sys_reset_l, srom_oe_l, srom_clk_h,
srom_data_h. Annotations: tsu = 4 x sysclk period + 1.1 ns; tho = 0 ns;
102,400 bits total.]




The minimum srom_clk_h cycle = (126 - sysclk ratio) x (CPU cycle time). 

The maximum srom_clk_h to srom_data_h delay allowable (in order to meet the 
required setup time) = [126 - (5 x sysclk ratio)] x (CPU cycle time). 
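
The two formulas above can be evaluated directly for a given chip speed and sysclk ratio. This is a minimal sketch. The function name is hypothetical, and the CPU cycle time is assumed to be the reciprocal of the chip speed.

```python
def srom_timing_ns(cpu_mhz, sysclk_ratio):
    """Return (minimum srom_clk_h cycle, maximum srom_clk_h to srom_data_h delay) in ns."""
    cpu_cycle_ns = 1000.0 / cpu_mhz                       # assumed: 1 / chip speed
    min_srom_clk_cycle_ns = (126 - sysclk_ratio) * cpu_cycle_ns
    max_clk_to_data_ns = (126 - 5 * sysclk_ratio) * cpu_cycle_ns
    return min_srom_clk_cycle_ns, max_clk_to_data_ns
```

For a 500-MHz part with a sysclk ratio of 4, the CPU cycle time is 2 ns, so the minimum srom_clk_h cycle is 122 x 2 = 244 ns and the maximum allowable srom_clk_h to srom_data_h delay is 106 x 2 = 212 ns.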

11.4.8 Clock Test Modes 

This section describes the 21164 clock test modes.

11.4.8.1 Normal (1x Clock) Mode

When clk_mode_h<2:0> = 101, the osc_clk_in_h,l frequency is not divided and a
clock equalizing circuit (called a symmetrator) is enabled. The symmetrator
equalizes the duty cycle of the input clock for use onchip. The osc_clk_in_h,l
signals must have a duty cycle of at least 60/40 for the symmetrator to work
properly. This is the preferred clocking mode of the 21164.

11.4.8.2 2x Clock Mode



When clk_mode_h<2:0> = 000, the osc_clk_in_h,l frequency is divided by 2. The 
osc_clk_in_h,l signals must have a duty cycle of at least 60/40. 



11.4.8.3 Chip Test Mode 

To lower the maximum frequency that the chip manufacturing tester is required to
supply, a divide-by-1 mode has been designed into the clock generator circuitry.
When clk_mode_h<2:0> = 001, the clock frequency that is applied to the input
clock signals osc_clk_in_h,l bypasses the clock divider and is sent to the chip clock
driver. This allows the chip internal circuitry to be tested at full speed with a one-half
frequency osc_clk_in_h,l.

Note: The clock symmetrator is not enabled in this mode. 

11.4.8.4 Module Test Mode 

When clk_mode_h<2:0> = 010, the clock frequency that is applied to the input
clock signals osc_clk_in_h,l is divided by 4 and is sent to the chip clock driver. The
digital phase-locked loop (DPLL) continues to keep the onchip sys_clk_out1_h,l
locked to ref_clk_in_h within the normal limits if a ref_clk_in_h signal is applied
(0 ns to 1 osc_clk_in_h,l cycle after ref_clk_in_h).

11.4.8.5 Clock Test Reset Mode 

When clk_mode_h<2:0> = 011, the sys_clk_out generator circuit is forced to reset
to a known state. This allows the chip manufacturing tester to synchronize the chip to
the tester cycle. Table 40 lists the clock test modes.

Table 40 Clock Test Modes

                            clk_mode_h
Mode                        <2>    <1>    <0>

Normal (1x) clock mode      1      0      1
2x clock mode               0      0      0
Chip test                   0      0      1
Module test                 0      1      0
Clock reset                 0      1      1
Not valid                   1      0      0
Not valid                   1      1      x
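
The clk_mode_h<2:0> encodings described in this section can be captured in a small lookup. The names below are hypothetical and only restate the encodings given in the text (101, 000, 001, 010, 011). Every other encoding is treated as not valid.

```python
# clk_mode_h<2:0> value -> mode name, as described in Sections 11.4.8.1 to 11.4.8.5
CLOCK_MODES = {
    0b101: "Normal (1x) clock mode",
    0b000: "2x clock mode",
    0b001: "Chip test",
    0b010: "Module test",
    0b011: "Clock reset",
}


def decode_clk_mode(bits):
    """Return the clock mode name for a 3-bit clk_mode_h<2:0> value."""
    if not 0 <= bits <= 0b111:
        raise ValueError("clk_mode_h<2:0> is a 3-bit field")
    return CLOCK_MODES.get(bits, "Not valid")
```

For example, decode_clk_mode(0b101) returns "Normal (1x) clock mode", while 0b100, 0b110, and 0b111 all decode to "Not valid".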



11.4.9 IEEE 1149.1 (JTAG) Performance

Table 41 lists the standard-mandated performance specifications for the IEEE 1149.1
circuits.

Table 41 IEEE 1149.1 Circuit Performance Specifications

Item                                                                  Specification

trst_l is asynchronous. Minimum pulse width.                          4 ns

trst_l setup time for deassertion before a transition on tck_h.      4 ns

Maximum acceptable tck_h clock frequency.                             16.6 MHz

tdi_h/tms_h setup time (referenced to tck_h rising edge).             4 ns

tdi_h/tms_h hold time (referenced to tck_h rising edge).              4 ns

Maximum propagation delay at pin tdo_h (referenced to tck_h           14 ns
falling edge).

Maximum propagation delay at system output pins (referenced to        20 ns
tck_h falling edge).
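
As a rough illustration of how the Table 41 numbers interact, the sketch below estimates the time left for a tester to capture tdo_h. It is hypothetical and rests on several assumptions that are not stated in this data sheet: tck_h has a 50% duty cycle, the tester samples tdo_h on the rising edge that follows the launching falling edge, and the tester's own setup time is supplied by the caller.

```python
MAX_TCK_MHZ = 16.6          # maximum acceptable tck_h frequency (Table 41)
TDO_MAX_DELAY_NS = 14.0     # maximum tdo_h delay from tck_h falling edge (Table 41)


def tdo_capture_margin_ns(tck_mhz, tester_setup_ns=0.0):
    """Estimate the slack between tdo_h becoming valid and the next tck_h rising edge."""
    if tck_mhz <= 0 or tck_mhz > MAX_TCK_MHZ:
        raise ValueError("tck_h frequency must be positive and at most 16.6 MHz")
    half_period_ns = 1000.0 / tck_mhz / 2.0    # assumes a 50% duty cycle
    return half_period_ns - TDO_MAX_DELAY_NS - tester_setup_ns
```

At a tck_h frequency of 10 MHz the half period is 50 ns, which under these assumptions leaves 36 ns before any tester setup time is subtracted.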

11.5 Power Supply Considerations 

For correct operation of the 21164, all of the Vss pins must be connected to ground,
all of the Vdd pins must be connected to a 3.3-V ±5% power source, and all of the
Vddi pins must be connected to a 2.5-V ±0.1 V power source. This source voltage
should be guaranteed (even under transient conditions) at the 21164 pins, and not just
at the PCB edge.

Plus 5 V is not used in the 21164. The voltage difference between the Vdd pins and
Vss pins must never be greater than 3.46 V, and the voltage difference between the
Vddi pins and Vss pins must never be greater than 2.6 V. If the differentials exceed
these limits, the 21164 chip will be damaged.
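
The supply limits above can be expressed as a simple check on voltages measured at the 21164 pins. This is a hypothetical helper, not part of any DIGITAL tool. It works in integer millivolts to avoid floating-point rounding at the limits.

```python
VDD_NOMINAL_MV = 3300
VDD_TOLERANCE_MV = 165      # 5% of 3.3 V
VDDI_NOMINAL_MV = 2500
VDDI_TOLERANCE_MV = 100     # 0.1 V
VDD_DAMAGE_MV = 3460        # Vdd to Vss must never exceed 3.46 V
VDDI_DAMAGE_MV = 2600       # Vddi to Vss must never exceed 2.6 V


def supply_problems(vdd_mv, vddi_mv):
    """Return a list of violated supply requirements (empty if none)."""
    problems = []
    if abs(vdd_mv - VDD_NOMINAL_MV) > VDD_TOLERANCE_MV:
        problems.append("Vdd outside 3.3 V +/-5%")
    if abs(vddi_mv - VDDI_NOMINAL_MV) > VDDI_TOLERANCE_MV:
        problems.append("Vddi outside 2.5 V +/-0.1 V")
    if vdd_mv > VDD_DAMAGE_MV:
        problems.append("Vdd above 3.46 V: chip damage")
    if vddi_mv > VDDI_DAMAGE_MV:
        problems.append("Vddi above 2.6 V: chip damage")
    return problems
```

A call such as supply_problems(3300, 2500) returns an empty list, while supply_problems(3500, 2500) reports both the tolerance violation and the damage limit.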
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11.5.1 Decoupling 

The effectiveness of decoupling capacitors depends on the amount of inductance 
placed in series with them. The inductance depends both on the capacitor style (con- 
struction) and on the module design. In general, the use of small, high-frequency 
capacitors placed close to the chip package's power and ground pins with very short 
module etch will give best results. Depending on the user's power supply and power 
supply distribution system, bulk decoupling may also be required on the module. 

The 21164 requires two sets of decoupling capacitors: one for Vdd and one for Vddi.

11.5.1.1 Vdd Decoupling 

The amount of decoupling capacitance connected between Vdd and Vss should be 
roughly equal to 10 times the amount of capacitive load that the 21164 is required to 
drive at any one time. This should guarantee a voltage drop of no more than 10% on 
Vdd during heavy drive conditions. 
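
One way to see why a factor of 10 keeps the drop within 10% is a simple charge-sharing estimate. The model below is an assumption made for illustration and is not taken from this data sheet: the load capacitance is charged entirely from the decoupling capacitance, and series inductance and supply replenishment are ignored.

```python
def required_decoupling_pf(load_pf, factor=10):
    """Decoupling capacitance suggested by the 10x guideline, in pF."""
    return factor * load_pf


def vdd_droop_fraction(load_pf, decoupling_pf):
    """Fractional Vdd drop when the load is charged from the decoupling capacitance.

    Charge sharing: V_after = V * C_dec / (C_dec + C_load),
    so the fractional drop is C_load / (C_dec + C_load).
    """
    return load_pf / (load_pf + decoupling_pf)
```

With decoupling equal to 10 times the load, the estimated drop is 1/11, or about 9.1%, which is consistent with the 10% figure quoted above.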

Use capacitors that are as physically small as possible. Connect the capacitors 
directly to the 21164 Vdd and Vss pins by short surface etch (0.64 cm [0.25 in] or 
less). The small capacitors generally have better electrical characteristics than the 
larger units and will more readily fit close to the IPGA pin field. 

When designing the placement of decoupling capacitors, Vdd decoupling capacitors 
should be favored over Vddi decoupling capacitors (that is, Vdd capacitors should 
be placed closer to the 21164 than the Vddi capacitors). 

11.5.1.2 Vddi Decoupling 

Each individual case must be separately analyzed, but generally designers should
plan to use at least 4 µF of capacitance connected between Vddi and Vss. Typically,
30 to 40 small, high-frequency 0.1-µF capacitors are placed near the chip's Vddi and
Vss pins. Actually placing the capacitors in the pin field is the best approach. Several
tens of µF of bulk decoupling (composed of tantalum and ceramic capacitors) should
be positioned near the 21164 chip.

Use capacitors that are as physically small as possible. Connect the capacitors
directly to the 21164 Vddi and Vss pins by short surface etch (0.64 cm [0.25 in] or
less). The small capacitors generally have better electrical characteristics than the
larger units, and will more readily fit close to the IPGA pin field.

11.5.2 Power Supply Sequencing

When applying or removing power to the 21164, Vdd (the 3.3-V supply voltage)
must be no less than Vddi (the 2.5-V supply voltage).

The following rules must be followed when either applying or removing the supply
voltages:

1. Vdd must always be at the same or a higher voltage than Vddi during normal
   operation.

2. The signal voltage must not exceed Vclamp.

3. The signal voltage must not be more than 2.4 V higher than Vddi.

Rule 1 means that either Vdd and Vddi can be brought up and down in unison or
Vddi can be applied after and removed before Vdd.

Rule 2 means that the signal voltage must not be allowed to exceed Vclamp during
the application or removal of power. Refer to Table 25 for the value of Vclamp. Note
that it is acceptable for the signal voltage either to be held at zero or to follow Vdd
during the application or removal of power.

Rule 3 means that, if the signal voltage follows Vdd, the signal voltage must never
be greater than 2.4 V above the value of Vddi. This applies equally during the
application or the removal of power.

Note that if the signal voltage is held at 0 V during power-up reset (that is, the ASICs
and SRAMs are set to drive 0 V during reset), Vdd and Vddi can be brought up
together. In a similar manner, the power-down situation can be managed if the signal
voltages are forced to 0 V when the loss of Vddi is detected.
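
The three sequencing rules can be checked sample by sample against a measured or simulated power ramp. The function below is a hypothetical sketch. Vclamp comes from Table 25 and is passed in by the caller. All voltages are integer millivolts.

```python
def sequencing_violations(vdd_mv, vddi_mv, signal_mv, vclamp_mv):
    """Return the numbers of the sequencing rules violated by one sample."""
    violated = []
    if vdd_mv < vddi_mv:                 # rule 1: Vdd must not be below Vddi
        violated.append(1)
    if signal_mv > vclamp_mv:            # rule 2: signal must not exceed Vclamp
        violated.append(2)
    if signal_mv > vddi_mv + 2400:       # rule 3: signal at most 2.4 V above Vddi
        violated.append(3)
    return violated
```

In use, the check would be applied to every sample of a ramp. For example, a signal that follows Vdd up to 3.3 V while Vddi is still at 0.5 V violates rule 3, whereas holding the signal at 0 V during the ramp does not.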

During power-up, Vddi can momentarily exceed the maximum steady-state value
under the following conditions:

• The transient voltage is 200 mV or less.

• The transient period lasts for 200 µs or less.

The transient voltage is defined as the voltage that rises above the maximum-allowed
steady-state value. The transient period is defined as the time beginning when the
transient voltage exceeds the steady-state value and ending when it falls back to it.

There is no derating for shorter transient periods or lower transient voltages (for
example, a 400-mV transient voltage lasting for 100 µs is not acceptable).
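
The transient limits can be applied to a sampled Vddi waveform as sketched below. The helper is hypothetical. It assumes that the maximum steady-state value is 2.6 V (2.5 V + 0.1 V) and approximates the transient period from the sample timestamps, so it is only as accurate as the sampling interval.

```python
def vddi_transient_ok(samples, steady_max_mv=2600,
                      max_overshoot_mv=200, max_duration_us=200):
    """Check sampled Vddi against the power-up transient limits.

    samples - list of (time_us, vddi_mv) tuples, sorted by time
    """
    start_us = None                              # start of the current excursion
    for time_us, vddi_mv in samples:
        if vddi_mv > steady_max_mv:
            if vddi_mv - steady_max_mv > max_overshoot_mv:
                return False                     # overshoot too large, no derating
            if start_us is None:
                start_us = time_us
            if time_us - start_us > max_duration_us:
                return False                     # excursion lasts too long
        else:
            start_us = None                      # back at or below steady-state limit
    return True
```

Both limits are tested independently, which reflects the statement that a larger but shorter transient is not acceptable.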
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All input and bidirectional signals are diode-clamped to Vdd and Vss. A current 
greater than Iclamp on an individual pin could damage the 21164. Designers must 
take care that currents greater than Iclamp will not be achieved during power-supply 
sequencing. While currents less than Iclamp will not damage the 21164, other 
source drivers connected to the 21164 could be damaged by the clamp. Designers
must verify that the source drivers will not be damaged by currents up to Iclamp.

12 Thermal Management



This section describes the 21164 thermal management and thermal design consider- 
ations. 

12.1 Operating Temperature 

The 21164 is specified to operate at the following temperatures at the center of the
heat sink (Tc):

72.6°C for 366 MHz

70.6°C for 433 MHz

68.6°C for 500 MHz

67.6°C for 533 MHz

65.6°C for 600 MHz

Temperature (Tc) should be measured at the center of the heat sink (between the two
package studs). The GRAFOIL pad is the interface material between the package
and the heat sink.

Table 42 lists the values for the center of heat-sink-to-ambient (θca) for the 499-pin
grid array. Table 43 shows the allowable Ta (without exceeding Tc) at various
airflows.

Note: DIGITAL recommends using the heat sink because it greatly improves
      the ambient temperature requirement.

Table 42 θca at Various Airflows

                                Airflow (linear ft/min)
                                100     200     400     600     800     1000

Frequency: 366 MHz, 433 MHz, 500 MHz, 533 MHz, and 600 MHz

θca with heat sink 1 (°C/W)     2.30    1.30    0.70    0.53    0.45    0.41
θca with heat sink 2 (°C/W)     1.25    0.75    0.48    0.40    0.35    0.32

Table 43 Maximum Ta at Various Airflows



                                Airflow (linear ft/min)
                                100     200     400     600     800     1000

Frequency: 366 MHz, Power: 31 W @Vdd = 3.3 V and @Vddi = 2.5 V

Ta with heat sink 1 (°C)        -       32.3    50.9    56.2    58.7    59.9
Ta with heat sink 2 (°C)        33.9    49.4    57.7    60.2    61.8    62.7

Frequency: 433 MHz, Power: 36 W @Vdd = 3.3 V and @Vddi = 2.5 V

Ta with heat sink 1 (°C)        -       23.8    45.4    51.5    54.4    55.8
Ta with heat sink 2 (°C)        25.6    43.6    53.3    56.2    58.0    59.1

Frequency: 500 MHz, Power: 41 W @Vdd = 3.3 V and @Vddi = 2.5 V

Ta with heat sink 1 (°C)        -       -       39.9    46.9    50.2    51.2
Ta with heat sink 2 (°C)        -       37.9    48.9    52.2    54.3    55.5

Frequency: 533 MHz, Power: 43.5 W @Vdd = 3.3 V and @Vddi = 2.5 V

Ta with heat sink 1 (°C)        -       -       37.2    44.5    48.1    49.8
Ta with heat sink 2 (°C)        -       35.0    46.7    50.2    52.4    53.7

Frequency: 600 MHz, Power: 48.5 W @Vdd = 3.3 V and @Vddi = 2.5 V

Ta with heat sink 1 (°C)        -       -       31.7    39.9    43.8    45.7
Ta with heat sink 2 (°C)        -       29.3    42.4    46.2    48.7    50.1
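
The maximum ambient temperatures appear to follow the usual thermal-resistance relation Ta = Tc - P x θca, where Tc is the specified heat-sink center temperature, P is the power listed for each frequency, and θca is the heat-sink-to-ambient value. This relation is an assumption, because the data sheet does not state it explicitly, but it reproduces most tabulated entries to within about 0.1°C. The sketch below uses hypothetical names and the values from this section.

```python
# Specified heat-sink center temperature (°C) and power (W) per frequency (MHz)
TC_MAX_C = {366: 72.6, 433: 70.6, 500: 68.6, 533: 67.6, 600: 65.6}
POWER_W = {366: 31.0, 433: 36.0, 500: 41.0, 533: 43.5, 600: 48.5}

# Heat-sink-to-ambient thermal resistance (°C/W) by heat sink type and airflow (ft/min)
THETA_CA = {
    1: {100: 2.30, 200: 1.30, 400: 0.70, 600: 0.53, 800: 0.45, 1000: 0.41},
    2: {100: 1.25, 200: 0.75, 400: 0.48, 600: 0.40, 800: 0.35, 1000: 0.32},
}


def max_ambient_c(freq_mhz, heat_sink, airflow_lfm):
    """Estimate the maximum ambient temperature as Ta = Tc - P * theta_ca."""
    theta = THETA_CA[heat_sink][airflow_lfm]
    return TC_MAX_C[freq_mhz] - POWER_W[freq_mhz] * theta
```

For a 366-MHz part with heat sink 1 at 200 linear ft/min, the estimate is 72.6 - 31 x 1.30 = 32.3°C, which matches the tabulated value.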
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Heat-Sink Specifications 



12.2 Heat-Sink Specifications 

Two heat sinks are specified. Heat sink type 1 mounting holes are in line with the 
cooling fins. Heat sink type 2 mounting holes are rotated 90° from the cooling fins. 
The heat sink composition is aluminum alloy 6063. Type 1 heat sink is shown in 
Figure 22, and type 2 heat sink is shown in Figure 23, along with their approximate 
dimensions. 

Figure 22 Type 1 Heat Sink

Figure 23 Type 2 Heat Sink

12.3 Thermal Design Considerations

Follow these guidelines for printed circuit board (PCB) component placement: 

• Orient the 21164 on the PCB with the heat-sink fins aligned with the airflow 
direction. 

• Avoid preheating ambient air. Place the 21164 on the PCB so that inlet air is not 
preheated by any other PCB components. 

• Do not place other high-power devices in the vicinity of the 21164.

• Do not restrict the airflow across the 21164 heat sink. Placement of other devices
must allow for maximum system airflow in order to maximize the performance
of the heat sink.

13 Mechanical Specifications



This section shows the 21164 mechanical packaging dimensions without a heat sink. 
For heat sink dimensions, refer to Section 12. 

Package Dimensions 

Figure 24 shows the package physical dimensions without a heat sink.

Figure 24 Package Dimensions

A Products and Documentation



To view current product update and errata revision information, visit the Alpha OEM 
World Wide Web Internet site. You can also visit this website for information about 
other Alpha microprocessors or help deciding which documentation best meets your 
needs: 

http://www.digital.com/semiconductor/alpha/alpha.htm 

- For documentation needs, click on Technical Information. 

- For product update information or information about other Alpha micropro- 
cessors, click on Microprocessor Products. 

DIGITAL Alpha Products

To order DIGITAL Alpha 21164 microprocessors, contact your local distributor. The
following table lists some of the Alpha microprocessors available from DIGITAL.

Chips                                                    Order Number

Alpha 21164 533-MHz microprocessor for NT only           21164-P8
Alpha 21164 500-MHz microprocessor                       21164-JC
Alpha 21164 533-MHz microprocessor                       21164-KC
Alpha 21164 600-MHz microprocessor                       21164-MC

DIGITAL Documentation

The following table lists some of the available DIGITAL documentation.

Title                                                                   Order Number

Alpha Architecture Reference Manual [1]                                 EY-W938E-DP
Alpha Architecture Handbook [2]                                         EC-QD2KB-TE
DIGITAL Alpha 21164 Microprocessor Hardware Reference Manual            EC-QP99C-TE
DIGITAL Alpha 21164 Microprocessor Product Brief                        EC-QP97D-TE
DIGITAL 21172 Core Logic Chipset Product Brief                          EC-QUQHA-TE
DIGITAL 21172 Core Logic Chipset Technical Reference Manual             EC-QUQJA-TE
Answers to Common Questions about PALcode for Alpha AXP Systems         EC-N0647-72
PALcode for Alpha Microprocessors System Design Guide                   EC-QFGLC-TE
Alpha Microprocessors Motherboard Windows NT 3.51 and 4.0               EC-QLUAH-TE
Installation Guide
SPICE Models for Alpha Microprocessors: An Application Note             EC-QA4XG-TE
Alpha Microprocessors SROM Mini-Debugger User's Guide                   EC-QHUXC-TE
Alpha Microprocessors Motherboard Debug Monitor User's Guide            EC-QHUVF-TE
Alpha Microprocessors Motherboard Software Design Tools User's Guide    EC-QHUWD-TE

[1] To purchase the Alpha Architecture Reference Manual, contact your local
    distributor or call Butterworth-Heinemann (Digital Press) at 1-800-366-2665.
[2] This handbook provides information subsequent to the Alpha Architecture
    Reference Manual.



Third-Party Documentation

You can order the following third-party documentation directly from the vendor.

Title                                                    Vendor

PCI Local Bus Specification, Revision 2.1                PCI Special Interest Group
PCI System Design Guide                                  U.S. 1-800-433-5177
                                                         International 1-503-797-4207
                                                         FAX 1-503-234-6762

IEEE Standard 754, Standard for Binary                   The Institute of Electrical and
Floating-Point Arithmetic                                Electronics Engineers, Inc.
                                                         U.S. 1-800-701-4333
IEEE Standard 1149.1, A Test Access Port and             International 1-908-981-0060
Boundary Scan Architecture                               FAX 1-908-981-9667